
Comments (7)

danielwestermann commented on May 12, 2024

Thanks Dimitri,

I was not aware of this documentation. It might be a good idea to link it from the README.md. I will try what you recommended and let you know how it works out.

Best regards
Daniel

from pg_auto_failover.

danielwestermann commented on May 12, 2024

Hi Dimitri,

When installing from source on CentOS/RedHat 8, the following should be added to the documentation; otherwise the systemd service will not start:

sudo semanage fcontext -a -t bin_t [PATH_TO]/pg_autoctl
restorecon -v [PATH_TO]/pg_autoctl

The state is reported correctly now; that was a DNS issue.

Best regards
Daniel


DimCitus commented on May 12, 2024

Hi Daniel! Thanks for your interest in pg_auto_failover and for your patience while the team was busy and traveling!

The CATCHINGUP state is documented with the others at https://pg-auto-failover.readthedocs.io/en/latest/fsm.html#state-reference and we can read:

The monitor assigns catchingup to the standby node when the primary is ready for a replication connection (pg_hba.conf has been properly edited, connection role added, etc).

The standby node keeper runs pg_basebackup, connecting to the primary’s nodename and port. The keeper then edits recovery.conf and starts PostgreSQL in hot standby node.

So replication is now ongoing, and the monitor needs some more information before it can ask the primary to add the secondary to synchronous_standby_names. Namely:

  1. the monitor health check must succeed; see the pgautofailover.node table for more information
  2. the WAL lag must be kept under 1 WAL segment by default; you can change this setting on the monitor side (https://pg-auto-failover.readthedocs.io/en/latest/ref/configuration.html#pg-auto-failover-monitor).

I guess that one of those conditions has not yet converged to an acceptable state while your node is still in catchingup. As soon as both are met, the monitor will assign the primary and secondary roles and replication will switch to SYNC.
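
As a rough illustration of the two conditions above, here is a minimal shell sketch. It is not the monitor's actual implementation: the function name `ready_for_secondary`, the encoding of a healthy node as `1`, and the strict "less than" comparison are assumptions made for the example. The 16 MiB value is the default PostgreSQL WAL segment size.

```shell
#!/bin/sh
# Illustration only: mirrors the two conditions described above,
# not the monitor's actual implementation.

# Default PostgreSQL WAL segment size: 16 MiB.
WAL_SEGMENT_BYTES=$((16 * 1024 * 1024))

# ready_for_secondary HEALTHY LAG_BYTES [THRESHOLD_BYTES]
#   HEALTHY    1 when the monitor health check succeeded (assumed encoding)
#   LAG_BYTES  WAL lag of the standby behind the primary, in bytes
# Returns 0 (true) only when both conditions hold.
ready_for_secondary() {
    healthy=$1
    lag_bytes=$2
    threshold=${3:-$WAL_SEGMENT_BYTES}
    [ "$healthy" -eq 1 ] && [ "$lag_bytes" -lt "$threshold" ]
}

# Hypothetical values: a healthy standby that is 1 MiB behind the primary.
if ready_for_secondary 1 1048576; then
    echo "standby can move from catchingup to secondary"
else
    echo "standby stays in catchingup"
fi
```

If either input fails the check, the node stays in catchingup, which matches the behaviour reported in this issue while DNS was misconfigured.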


dimitri commented on May 12, 2024

I'm not sure where to add the extra SELinux setup commands in the documentation. Would that be somewhere in https://pg-auto-failover.readthedocs.io/en/latest/install.html#installing-a-pgautofailover-systemd-unit then?


danielwestermann commented on May 12, 2024


danielwestermann commented on May 12, 2024

Now I remember why I used the --nodename parameter:

[postgres@pg-af3 ~]$ ./_reset.sh
17:51:00 INFO  Using --nodename "pg-af3", which resolves to IP address "192.168.22.72"
17:51:02 INFO  Initialising a PostgreSQL cluster at "/u02/pgdata/12/af"
17:51:02 INFO   /u01/app/postgres/product/12/db_0/bin/pg_ctl --pgdata /u02/pgdata/12/af --options "-p 5432" --options "-h *" --wait start
17:51:03 INFO  Granting connection privileges on 192.168.22.0/24
17:51:03 INFO  Your pg_auto_failover monitor instance is now ready on port 5432.
17:51:03 INFO  pg_auto_failover monitor is ready at postgres://autoctl_node@pg-af3:5432/pg_auto_failover
17:51:03 INFO  Monitor has been succesfully initialized.
17:51:03 INFO  Found pg_ctl for PostgreSQL 12.0 at /u01/app/postgres/product/12/db_0/bin/pg_ctl
17:51:03 INFO  Registered node pg-af1.it.dbi-services.com:5432 with id 1 in formation "default", group 0.
17:51:03 INFO  Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/u02/pgdata/12/PG1/pg_autoctl.init"
17:51:03 INFO  Successfully registered as "single" to the monitor.
17:51:05 INFO  Initialising a PostgreSQL cluster at "/u02/pgdata/12/PG1"
17:51:05 INFO  Postgres is not running, starting postgres
17:51:05 INFO   /u01/app/postgres/product/12/db_0/bin/pg_ctl --pgdata /u02/pgdata/12/PG1 --options "-p 5432" --options "-h *" --wait start
17:51:05 INFO  CREATE DATABASE postgres;
17:51:05 INFO  The database "postgres" already exists, skipping.
17:51:06 INFO  FSM transition from "init" to "single": Start as a single node
17:51:06 INFO  Initialising postgres as a primary
17:51:06 INFO  Transition complete: current state is now "single"
17:51:06 INFO  Keeper has been succesfully initialized.
17:51:06 WARN  Failed to connect: Permission denied
17:51:06 FATAL Failed to find a local IP address, please provide --nodename.
17:51:06 FATAL Failed to auto-detect the hostname of this machine, please provide one via --nodename
17:51:07 WARN  Failed to connect: Permission denied
17:51:07 FATAL Failed to find a local IP address, please provide --nodename.
17:51:07 FATAL Failed to auto-detect the hostname of this machine, please provide one via --nodename
                      Name |   Port | Group |  Node |     Current State |    Assigned State
---------------------------+--------+-------+-------+-------------------+------------------
pg-af1.it.dbi-services.com |   5432 |     0 |     1 |            single |            single

[postgres@pg-af3 ~]$ 

Without the --nodename parameter, the initialization of the replica fails with:


17:51:06 WARN  Failed to connect: Permission denied
17:51:06 FATAL Failed to find a local IP address, please provide --nodename.
17:51:06 FATAL Failed to auto-detect the hostname of this machine, please provide one via --nodename
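
Since DNS was already a problem earlier in this thread, one thing worth ruling out before passing a name to --nodename is whether that name resolves on the node at all. The following is only a sketch under that assumption: the helper name `resolve_nodename` is hypothetical, it relies on the standard `getent` and `awk` utilities, and it does not reproduce how pg_autoctl itself detects the local address.

```shell
#!/bin/sh
# Print the first address the system resolver returns for a host name.
# Prints nothing when the name does not resolve.
resolve_nodename() {
    getent hosts "$1" | awk 'NR == 1 { print $1 }'
}

# Hypothetical usage: check the name you intend to pass as --nodename.
# Falls back to "localhost" when no argument is given.
nodename="${1:-localhost}"
addr="$(resolve_nodename "$nodename")"
if [ -n "$addr" ]; then
    echo "$nodename resolves to $addr"
else
    echo "$nodename does not resolve; fix DNS or /etc/hosts first" >&2
fi
```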


DimCitus commented on May 12, 2024

Closing this issue for triaging purposes. Feel free to re-open if more actions are expected.
