
keepalived's Introduction

keepalived: Loadbalancing & High-Availability


The main goal of this project is to provide simple and robust facilities for loadbalancing and high-availability to Linux systems and Linux-based infrastructures. The loadbalancing framework relies on the well-known and widely used Linux Virtual Server (IPVS) kernel module, which provides Layer 4 loadbalancing. Keepalived implements a set of checkers to dynamically and adaptively maintain and manage the loadbalanced server pool according to the servers' health. High-availability, in turn, is achieved by the Virtual Router Redundancy Protocol (VRRP), a fundamental building block for router failover. In addition, Keepalived implements a set of hooks into the VRRP finite state machine, providing low-level and high-speed protocol interactions. To offer the fastest network failure detection, Keepalived implements the Bidirectional Forwarding Detection (BFD) protocol; VRRP state transitions can take BFD hints into account to drive fast state changes. The Keepalived frameworks can be used independently or all together to provide resilient infrastructures.

Keepalived's implementation is based on an I/O multiplexer that drives a robust multi-threading framework. All event processing goes through this I/O multiplexer.
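
As a rough illustration of what an I/O multiplexer does, the Python sketch below registers a socket with a selector and runs a callback when the socket becomes readable. It is not keepalived's C implementation; the handler, the payload and the function names are made up for the example.

```python
import selectors
import socket


def dispatch_ready(sel, timeout=1.0):
    """Wait for ready file objects and run the handler stored with each one."""
    results = []
    for key, _mask in sel.select(timeout):
        handler = key.data  # callback registered alongside the socket
        results.append(handler(key.fileobj))
    return results


def demo():
    """Send one message over a socket pair and dispatch it via the selector."""
    sel = selectors.DefaultSelector()
    sender, receiver = socket.socketpair()
    try:
        # Hypothetical handler: read whatever arrived on the socket.
        sel.register(receiver, selectors.EVENT_READ, data=lambda s: s.recv(64))
        sender.send(b"advert")
        return dispatch_ready(sel)
    finally:
        sel.close()
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(demo())
```

A single loop of this kind can serve timers, sockets and child-process events from one place, which is the general idea behind routing all events through one multiplexer.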

To build keepalived from the git source tree, you will need to have autoconf, automake and various libraries installed. See the INSTALL file for details of what needs to be installed and what needs to be executed before building keepalived.

Keepalived is free software, Copyright (C) Alexandre Cassen. See the file COPYING for copying conditions.

OPENSSL TOOLKIT LICENCE EXCEPTION

In addition, as the copyright holder of Keepalived, I, Alexandre Cassen, [email protected], grant the following special exception:

I, Alexandre Cassen, <[email protected]>, explicitly allow
the compilation and distribution of the Keepalived software with
the OpenSSL Toolkit.

keepalived's People

Contributors

acassen, alandewar, andempsey, andriyanov, dclabaut-ovh, dev-itsheng, flexiondotorg, flygoast, frankbb, henkworks, hw-lj, ivoronin, jonasj76, jsouthworth, kokke, kwb0523, louis-6wind, markshuttle, pandax381, pommi, ppenzo, pqarmitage, rik78, rohara, rubenk, tamutamu, thresheek, toreanderson, vincentbernat, zhouxudong199


keepalived's Issues

After reload and then restart, keepalived does not apply the configuration. In keepalived 1.2.7 - 1.2.9

How to reproduce:

  1. Change the configuration file, for example from:
    --------- snip-----------
    virtual_ipaddress {
    10.241.238.170/32 dev br1
    10.241.238.188/32 dev br1
    }
    -------- snip------------
    to
    --------- snip-----------
    virtual_ipaddress {
    10.241.238.170/32 dev br1 label br1:1
    10.241.238.188/32 dev br1 label br1:4
    }
    -------- snip----------
  2. Reload keepalived:
    /etc/init.d/keepalived reload
    Reloading keepalived: [ OK ]
  3. Check changes:
    ip a |grep br1
    5: br1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    inet 10.241.238.212/24 brd 10.241.238.255 scope global br1
    inet 10.241.238.170/32 scope global br1
    inet 10.241.238.188/32 scope global br1
  4. No changes, ok restart it:
    /etc/init.d/keepalived restart
    Stopping keepalived: [ OK ]
    Starting keepalived: [ OK ]
  5. Check changes again:
    ip a |grep br1
    5: br1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    inet 10.241.238.212/24 brd 10.241.238.255 scope global br1
    inet 10.241.238.170/32 scope global br1
    inet 10.241.238.188/32 scope global br1

No changes again, this is really strange....

  6. Ok, stop it and check addresses:
    /etc/init.d/keepalived stop
    Stopping keepalived: [ OK ]
    ip a |grep br1
    5: br1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    inet 10.241.238.212/24 brd 10.241.238.255 scope global br1
    inet 10.241.238.170/32 scope global br1
    inet 10.241.238.188/32 scope global br1

Service is stopped, but the virtual addresses are still present ((((

Updated keepalived.conf.SYNOPSIS with latest changes

Since the documentation for keepalived is mostly outdated, it would be really nice if the keepalived.conf.SYNOPSIS is up to date with the latest changes. If it is already up to date, then all is good and close this, but I guess it is outdated, since for protocol it says Only TCP is implemented, but UDP works as well.

VRRP preempt_delay broken

Hi, current git master has two bugs relating to the preempt_delay setting. They were both introduced by commit c7a985d:

commit c7a985db4136b374dbee25caa39aa121bc16fb7d
Author: Joachim Nilsson <[email protected]>
Date:   Tue Sep 10 15:36:36 2013 +0200

Honor preempt_delay setting on startup.

This is a fix to honor the preempt_delay setting on power-up, or reboot, by
preventing a BACKUP router to transition to MASTER until its preempt timer
has expired.

Bug 1 is that preempt_delay comes into play at startup even when there are no other active VRRP speakers on the link. The keepalived.conf manual page is very clear that preemption means something different, namely to «preempt a lower priority machine when a higher priority machine comes online». This is also what I understand the English word "preemption" to mean, i.e., to take over something from someone else.

The negative consequence of this bug is that if you're recovering from an outage that has impacted all VRRP speakers simultaneously (for example a power loss), you prolong the outage for the duration of preempt_delay, since none of them will go to MASTER state before preempt_delay has expired.

From the commit log I do realise that the patch is probably «working as intended», but in my opinion changing the meaning of preempt_delay in this manner is the wrong thing to do - the functionality should instead have been implemented as a brand new setting called "startup_delay" or "initial_delay" or something like that. That is, if it really is necessary to implement this in keepalived at all - it appears to me that doing "sleep && keepalived" in the init script would accomplish exactly the same thing.

Bug 2 is far more critical, namely that preempt_delay is used at reload, irrespective of the current state of the VRRP instance. So if you have a VRRP instance in the MASTER state and send keepalived SIGHUP, it will sit there with the virtual addresses active for the duration of preempt_delay before it resumes sending out VRRP advertisements. If preempt_delay is set to anything higher than a couple of seconds, another VRRP speaker will notice the absence of advertisements and transition to MASTER state, and you'll have a "split-brain" active/active situation, which will last until preempt_delay has expired and the node starts transmitting VRRP advertisements again.

I propose simply reverting the problematic commit. This solves both problems.

The following logs demonstrate the problem being reproduced. There is only one VRRP speaker on the link. Configuration is as follows:

vrrp_instance eth1 {
    interface eth1
    virtual_router_id 10
    preempt_delay 30
    virtual_ipaddress {
        192.168.1.1/30
    }
}
Mar  6 11:13:04 ucstest Keepalived[21800]: Starting Keepalived v1.2.12 (03/03,2014)
Mar  6 11:13:04 ucstest Keepalived[21801]: Starting Healthcheck child process, pid=21802
Mar  6 11:13:04 ucstest Keepalived[21801]: Starting VRRP child process, pid=21804
Mar  6 11:13:04 ucstest Keepalived_healthcheckers[21802]: Initializing ipvs 2.6
Mar  6 11:13:04 ucstest Keepalived_vrrp[21804]: Registering Kernel netlink reflector
Mar  6 11:13:04 ucstest Keepalived_vrrp[21804]: Registering Kernel netlink command channel
Mar  6 11:13:04 ucstest Keepalived_vrrp[21804]: Registering gratuitous ARP shared channel
Mar  6 11:13:04 ucstest Keepalived_vrrp[21804]: Opening file '/etc/keepalived/keepalived.conf'.
Mar  6 11:13:04 ucstest Keepalived_vrrp[21804]: Configuration is using : 59582 Bytes
Mar  6 11:13:04 ucstest Keepalived_vrrp[21804]: Using LinkWatch kernel netlink reflector...
Mar  6 11:13:04 ucstest Keepalived_healthcheckers[21802]: IPVS: Can't initialize ipvs: Protocol not available
Mar  6 11:13:04 ucstest Keepalived_vrrp[21804]: VRRP_Instance(eth1) Entering BACKUP STATE
Mar  6 11:13:04 ucstest Keepalived_healthcheckers[21802]: Registering Kernel netlink reflector
Mar  6 11:13:04 ucstest Keepalived_healthcheckers[21802]: Registering Kernel netlink command channel
Mar  6 11:13:04 ucstest Keepalived_healthcheckers[21802]: Opening file '/etc/keepalived/keepalived.conf'.
Mar  6 11:13:04 ucstest Keepalived_healthcheckers[21802]: Configuration is using : 4713 Bytes
Mar  6 11:13:04 ucstest Keepalived_healthcheckers[21802]: Using LinkWatch kernel netlink reflector...

Bug 1: The delay at this point. There are no other VRRP speakers on the link, so there is nobody to preempt, yet preempt_delay comes into play.

Mar  6 11:13:34 ucstest Keepalived_vrrp[21804]: VRRP_Instance(eth1) Transition to MASTER STATE
Mar  6 11:13:35 ucstest Keepalived_vrrp[21804]: VRRP_Instance(eth1) Entering MASTER STATE

The virtual address gets added to the interface at this point.

Mar  6 11:14:54 ucstest Keepalived_healthcheckers[21802]: Got SIGHUP, reloading checker configuration
Mar  6 11:14:54 ucstest Keepalived_healthcheckers[21802]: Initializing ipvs 2.6
Mar  6 11:14:54 ucstest Keepalived_healthcheckers[21802]: IPVS: Can't initialize ipvs: Protocol not available
Mar  6 11:14:54 ucstest Keepalived_healthcheckers[21802]: Registering Kernel netlink reflector
Mar  6 11:14:54 ucstest Keepalived_healthcheckers[21802]: Registering Kernel netlink command channel
Mar  6 11:14:54 ucstest Keepalived_healthcheckers[21802]: Opening file '/etc/keepalived/keepalived.conf'.
Mar  6 11:14:54 ucstest Keepalived_healthcheckers[21802]: Configuration is using : 3601 Bytes
Mar  6 11:14:54 ucstest Keepalived_healthcheckers[21802]: Using LinkWatch kernel netlink reflector...
Mar  6 11:14:55 ucstest Keepalived_vrrp[21804]: Registering Kernel netlink reflector
Mar  6 11:14:55 ucstest Keepalived_vrrp[21804]: Registering Kernel netlink command channel
Mar  6 11:14:55 ucstest Keepalived_vrrp[21804]: Registering gratuitous ARP shared channel
Mar  6 11:14:55 ucstest Keepalived_vrrp[21804]: Opening file '/etc/keepalived/keepalived.conf'.
Mar  6 11:14:55 ucstest Keepalived_vrrp[21804]: Configuration is using : 58423 Bytes
Mar  6 11:14:55 ucstest Keepalived_vrrp[21804]: Using LinkWatch kernel netlink reflector...
Mar  6 11:14:55 ucstest Keepalived_vrrp[21804]: cant do IP_DROP_MEMBERSHIP errno=Bad file descriptor (9)

Bug 2: The delay at this point. The virtual address remains configured on eth1, but no VRRP advertisements are transmitted. Had there been another VRRP speaker on the link, we would have had a split-brain situation during this delay (except for the first few seconds).

Mar  6 11:15:25 ucstest Keepalived_vrrp[21804]: VRRP_Instance(eth1) Transition to MASTER STATE
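
To put a number on "the first few seconds" in Bug 2, the sketch below uses the timer formulas from RFC 3768 (VRRPv2): Skew_Time = (256 - Priority) / 256 and Master_Down_Interval = 3 * Advertisement_Interval + Skew_Time. The backup priority of 100 and the 1 second advertisement interval are assumptions for illustration, since the configuration above sets neither, and the helper names are hypothetical. The result is approximate because the backup's timer runs from the last advertisement it received, not from the moment of the reload.

```python
def master_down_interval(advert_int: float, priority: int) -> float:
    """Seconds a BACKUP waits without adverts before becoming MASTER (RFC 3768)."""
    skew_time = (256 - priority) / 256
    return 3 * advert_int + skew_time


def split_brain_window(preempt_delay: float, advert_int: float,
                       backup_priority: int) -> float:
    """Approximate time both nodes hold the VIP after a reload on the MASTER.

    Assumes the reloaded node stays silent for the whole preempt_delay,
    as described in Bug 2 above.
    """
    takeover = master_down_interval(advert_int, backup_priority)
    return max(0.0, preempt_delay - takeover)


if __name__ == "__main__":
    # Assumed values: preempt_delay 30 (from the config above),
    # advert_int 1 and backup priority 100 (not stated in the report).
    print(master_down_interval(1, 100))
    print(split_brain_window(30, 1, 100))
```

Under these assumptions a backup would take over after roughly 3.6 seconds, leaving about 26 seconds of overlap with preempt_delay 30.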

Election always elects the same node among nodes with the same state/priority

I'm spawning 2 EC2 instances on Amazon with Keepalived v1.2.13. They have the same priority, as I do not want flapping when a MASTER goes down and then comes back up.
I use unicast; the IPs are 172.17.16.10 and .11.
Regardless of the state (I have tried BACKUP + BACKUP, MASTER + BACKUP and vice versa) or which instance came up first, the instance with IP .11 always enters MASTER as soon as both instances exchange unicast packets.

This makes me think the IP / fingerprint plays a role in an election between nodes with the same priority.

I have swapped the IP addresses used on both instances, and I can confirm that .11 still gets elected.
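
The observation is consistent with the tie-break rule in RFC 3768 (section 6.4.3): a MASTER that receives an advertisement with equal priority yields if the sender's primary IP address is greater than its own. The sketch below models only that rule; the function name is hypothetical, it ignores the special priority-0 case, and whether keepalived 1.2.13 applies the rule in exactly this way in unicast mode is not confirmed here.

```python
import ipaddress


def should_yield(local_priority: int, local_ip: str,
                 advert_priority: int, advert_ip: str) -> bool:
    """Return True if a MASTER receiving this advert should go to BACKUP."""
    if advert_priority > local_priority:
        return True
    if advert_priority == local_priority:
        # Equal priority: the higher primary IP address wins.
        return ipaddress.ip_address(advert_ip) > ipaddress.ip_address(local_ip)
    return False


if __name__ == "__main__":
    # Addresses taken from the report above; priority 100 is an assumed value.
    print(should_yield(100, "172.17.16.10", 100, "172.17.16.11"))
    print(should_yield(100, "172.17.16.11", 100, "172.17.16.10"))
```

With equal priorities, the node at .10 yields to .11 and never the other way round, which would explain why .11 always ends up as MASTER.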

keepalived does not remove VIP addrs when it is stopped

This bug has been around for years. When keepalived is stopped, it does not remove the VIPs that it added.

bash-4.3$ ip -4 addr ls dev lan0.3003
5: lan0.3003@lan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
inet 192.168.53.139/24 brd 192.168.53.255 scope global lan0.3003
valid_lft forever preferred_lft forever

bash-4.3$ cat /tmp/keepalived.conf
vrrp_instance pcr-ny4-mktdata-relay {
    state MASTER
    interface lan0.3003
    virtual_router_id 99
    priority 200
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass kaTest
    }
    virtual_ipaddress {
        192.168.53.250/24 brd 192.168.53.255 dev lan0.3003
    }
    notify /tmp/katest.sh
}

bash-4.3$ /extra_disk/tmp/katest/sbin/keepalived -f /tmp/keepalived.conf --vrrp -l -D -n -d
Starting VRRP child process, pid=22258
Netlink reflector reports IP 192.168.50.139 added
Netlink reflector reports IP 10.25.3.133 added
Netlink reflector reports IP 192.168.53.139 added
Netlink reflector reports IP 192.168.55.139 added
Netlink reflector reports IP 192.168.61.139 added
Netlink reflector reports IP 10.211.0.5 added
Netlink reflector reports IP 192.168.63.139 added
Netlink reflector reports IP 192.168.62.139 added
Netlink reflector reports IP 192.168.60.139 added
Registering Kernel netlink reflector
Registering Kernel netlink command channel
Registering gratuitous ARP shared channel
Opening file '/tmp/keepalived.conf'.
Configuration is using : 64263 Bytes
------< Global definitions >------
Router ID = ti139
VRRP IPv4 mcast group = 224.0.0.18
VRRP IPv6 mcast group = ff02::12
------< VRRP Topology >------
VRRP Instance = pcr-ny4-mktdata-relay
Using VRRPv2
Want State = MASTER
Runing on device = lan0.3003
Gratuitous ARP repeat = 5
Gratuitous ARP refresh repeat = 1
Virtual Router ID = 99
Priority = 200
Advert interval = 1 sec

Accept disabled
Authentication type = SIMPLE_PASSWORD
Password = kaTest
Virtual IP = 1
192.168.53.250/24 brd 192.168.53.255 dev lan0.3003 scope global
Generic state transition script = '/tmp/katest.sh'
Using LinkWatch kernel netlink reflector...
VRRP sockpool: [ifindex(5), proto(112), unicast(0), fd(10,11)]
VRRP_Instance(pcr-ny4-mktdata-relay) Transition to MASTER STATE
VRRP_Instance(pcr-ny4-mktdata-relay) Entering MASTER STATE
VRRP_Instance(pcr-ny4-mktdata-relay) setting protocol VIPs.
VRRP_Instance(pcr-ny4-mktdata-relay) Sending gratuitous ARPs on lan0.3003 for 192.168.53.250
Opening script file /tmp/katest.sh
VRRP_Instance(pcr-ny4-mktdata-relay) Sending gratuitous ARPs on lan0.3003 for 192.168.53.250
^CStopping Keepalived v1.2.17 (06/18,2015)

bash-4.3$ ip -4 addr ls dev lan0.3003
5: lan0.3003@lan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
inet 192.168.53.139/24 brd 192.168.53.255 scope global lan0.3003
valid_lft forever preferred_lft forever
inet 192.168.53.250/24 brd 192.168.53.255 scope global secondary lan0.3003
valid_lft forever preferred_lft forever

Is it possible that no GARP was sent from a master that received a lower prio advert?

Hi,

I've spent most of the morning trying to debug an issue with our keepalived setup and the only conclusion that I've been able to come up with is the following (unfortunately there's mention of something that sounds just like this in the changelog from 2002... so hrm!):

We have two instances of keepalived running on virtual machines.

This morning at 09:39:29 the slave instance decided that it was going to take over as master (fair enough, that's what it's supposed to do if it thinks the other end is dead right?).

The slave logged a whole bunch of Transition to MASTER STATE entries, then began logging Entering MASTER STATE. There are 20 VRRP instances, and just before the last VRRP instance entered master state, the slave logged that this instance had received a higher prio advert (this is the point where the existing master wondered what on earth was going on and began logging: Received lower prio advert, forcing new election). At 09:39:30 all of the instances on the slave returned to BACKUP state; the master logged nothing more, as presumably all its instances stayed in MASTER state.

At this point everything should have been fine, but it was not. Our alerting started complaining that almost everything was broken (these specific VMs are routers between a bunch of networks).

Some time later (and some tcpdumping later), I worked out that the reason why nothing was working was that some packets were travelling via the master and others were attempting to travel via the slave (and therefore being rejected).

An example is, I could see a SYN from a request to one of our services traverse the master and hit our load balancers. The SYN being sent on from the load balancers (we also run keepalived and use ipvs on our load balancers) came back through the router for the packet to hit the real server. The SYN ACK from the real server was nowhere to be seen on the master, however it did appear on the slave (and therefore was rejected).

Now I can't prove this....

(Shortly afterwards I shut down keepalived on the master, allowing failover to the slave to happen properly, and confirmed everything worked with the slave as the new master. I then started keepalived on the original master, let everything fail back, and confirmed that everything worked as expected again.)

But all I can think is that the machines (ie my real servers, not that this is important here), had the incorrect MAC address of their default gateway in their ARP cache. As the packets were hitting my slave machine, I assume that during the short period where the slave also had all of the vrrp instances in MASTER state, some of my machines had arped for their default gateways and received the MAC addresses from the slave. When the master forced an election causing the slave to relinquish control again, I have a feeling that no GARP was sent out by the master meaning that some of my machines had the wrong MAC address in their cache.

Like I said, I can't prove that, because in restarting keepalived (and therefore fixing everything) I lost the ability to go and see what was in the various ARP caches.

But I can't think of any reason other than this why some machines would have been sending packets to the wrong location!

FWIW, my setup was in this state for a good 30 minutes until I restarted keepalived on the original master. I've been trying to work out how long entries take to time out of the ARP cache, or to be renewed if they've been in there for a while... but I can't really work it out. I think that if an entry is continuously used (i.e. a default gateway), then it will never time out and will never be renewed. But I could be misunderstanding that.

As suggested at the top, there's something in the Changelog that sounds very similar to what I describe above. But it's from 2002.

I'm using Ubuntu packages of Keepalived, version 1.1.20.

Thanks :)

trouble with garp_master_refresh

Hello,
I set garp_master_refresh to 14400, but keepalived sets this timer to 1515 (2014-05-23_10:06:22.81095 Gratuitous ARP refresh timer = 1515).
I think it's because of:
vrrp->garp_refresh = atoi(vector_slot(strvec, 1)) * TIMER_HZ;
where
grep garp_refresh include/vrrp.h
int garp_refresh; /* Next scheduled gratuitous ARP refresh */

and

#define TIMER_HZ 1000000
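
The arithmetic supports this diagnosis: 14400 * 1000000 does not fit in a 32-bit int, and wrapping it to 32 bits gives 1515098112 microseconds, i.e. the reported 1515 seconds. The Python sketch below emulates that wrap-around. The helper names are hypothetical, and the two's complement wrap is the behaviour commonly seen in practice rather than something the C standard guarantees for signed overflow.

```python
TIMER_HZ = 1_000_000  # microseconds per second, as in the #define above


def to_int32(value: int) -> int:
    """Wrap an arbitrary integer to a signed 32-bit value (two's complement)."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def garp_refresh_seconds(configured_seconds: int) -> int:
    """Emulate storing seconds * TIMER_HZ in a 32-bit int, then converting back."""
    ticks = to_int32(configured_seconds * TIMER_HZ)
    return ticks // TIMER_HZ


if __name__ == "__main__":
    print(garp_refresh_seconds(14400))
    print(garp_refresh_seconds(2000))
```

Under this model, any value above 2147 seconds overflows a signed 32-bit microsecond counter.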

IPSEC-AH auth_type on different network segments

Hello,

I stumbled across an issue when trying to switch from PASS to IPSEC-AH with the unicast extension and Keepalived v1.2.12:

lb1 Keepalived_vrrp[28581]: bogus VRRP packet received on eth0 !!!
lb1 Keepalived_vrrp[28581]: VRRP_Instance(VI_2) Dropping received VRRP packet...
lb1 Keepalived_vrrp[28581]: VRRP_Instance(VI_2) IPSEC-AH : invalid IPSEC HMAC-MD5 value. Due to fields mutation or bad password !
lb2 Keepalived_vrrp[31100]: bogus VRRP packet received on eth0 !!!
lb2 Keepalived_vrrp[31100]: VRRP_Instance(VI_1) ignoring received advertisment...
lb2 Keepalived_vrrp[31100]: VRRP_Instance(VI_1) IPSEC-AH : invalid IPSEC HMAC-MD5 value. Due to fields mutation or bad password 

Passwords were the same. Looked at my other deployments and it works fine with v1.2.9 if unicast_peer(s) are on the same network segment, and it also works fine when using v1.2.12 PASS auth_type on different segments (the sanity check was fixed in 1.2.10 iirc).

I realize that the latest release is 1.2.13, but didn't find anything in the changelog relevant to this problem.

Keepalived doesn't execute any notify or smtp_alert on the transition from Backup to Fault state

Hello,

We are currently seeing the following problem on version 1.2.7 (RHEL6 RPM); it also occurs with my own RPM of version 1.2.13.

"notify", "notify_fault" and "smtp_alert" do not fire when Keepalived is in Backup state and the vrrp_script fails. We can see in the log: Keepalived_vrrp[29331]: VRRP_Script(check_mysql) failed, but nothing happens.

I have even tested using the split "notify_*" directives instead of "notify" (this doesn't work either).

This is really important for us: we use notify to track the status of VRRP, and we only discover that a presumed "Backup" instance is actually in "FAULT" state when we switch over roles (server01 to server02, for example).

Thanks for your help on this problem.

Here is my configuration :

! Configuration File for keepalived

global_defs {
    notification_email {
        [email protected]
    }
    notification_email_from server01
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id server01
}

vrrp_script check_mysql {
    script "/root/mysql_keepalived.check"
    interval 3
    rise 3
    fall 3
    #weight 2
}

vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    interface eth2
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.120.10.80/16 dev eth2 label eth2:mysql
    }
    #track_interface {
    #    eth0
    #}

    track_script {
        check_mysql
    }

    notify_master "/root/master.sh"
    notify_backup "/root/backup.sh"
    notify_fault "/root/faulty.sh"
    notify "/root/keepalived_update_status"
    smtp_alert
}

Example Faulty.sh:

#!/bin/bash

echo "$(date) - i am faulty" >> /tmp/keepalived_log

Question: execute notify after "Received lower prio advert"

I've installed Keepalived to create a HA pair of Nginx load balancers. I'm hosting them on OVH and switching a failover IP from one server to another requires an API call. I've created a script to make those API calls using the notify_master directive.

This has been working nicely until last night. The backup server became master for a few seconds (probably because of packet loss or something else that prevented it from seeing the VRRP announcements from the master), causing the following message on the real master: "Received lower prio advert, forcing new election". The server went back to backup state but, as there were no state changes on the primary server, the last execution of the script was from the secondary one, causing OVH to route the IPs to the secondary server while the interfaces were configured on the primary one.

Is there a way to make keepalived execute the script after forcing a new election?

Here I attach the logs and config files for better understanding of my problem.

root@glados:~# salt -G 'roles:lb' cmd.run 'grep -i keepalived /var/log/syslog.1'
glados.xxxx.com:
    Jul 13 23:46:44 glados Keepalived_vrrp: VRRP_Instance(LOAD_BALANCER) Received lower prio advert, forcing new election
    Jul 13 23:47:01  Keepalived_vrrp: last message repeated 6 times
borg.xxxx.com:
    Jul 13 23:46:44 borg Keepalived_vrrp: VRRP_Instance(LOAD_BALANCER) Transition to MASTER STATE
    Jul 13 23:46:45 borg Keepalived_vrrp: VRRP_Instance(LOAD_BALANCER) Entering MASTER STATE
    Jul 13 23:46:45 borg Keepalived_vrrp: Opening script file /usr/local/src/keepalived/take-ips.py
    Jul 13 23:46:48 borg Keepalived_vrrp: VRRP_Instance(LOAD_BALANCER) Received higher prio advert
    Jul 13 23:46:48 borg Keepalived_vrrp: VRRP_Instance(LOAD_BALANCER) Entering BACKUP STATE
    Jul 13 23:46:51 borg Keepalived_vrrp: VRRP_Instance(LOAD_BALANCER) Transition to MASTER STATE
    Jul 13 23:46:52 borg Keepalived_vrrp: VRRP_Instance(LOAD_BALANCER) Entering MASTER STATE
    Jul 13 23:46:52 borg Keepalived_vrrp: Opening script file /usr/local/src/keepalived/take-ips.py
    Jul 13 23:46:52 borg Keepalived_vrrp: VRRP_Instance(LOAD_BALANCER) Received higher prio advert
    Jul 13 23:46:52 borg Keepalived_vrrp: VRRP_Instance(LOAD_BALANCER) Entering BACKUP STATE
# keepalived.conf in glados (primary server)
vrrp_instance LOAD_BALANCER {
  nopreempt
  interface eth1
  state MASTER
  virtual_router_id 51
  priority 100

  virtual_ipaddress_excluded {
    ***.***.***.** dev eth1 label eth1:1
    ***.***.***.** dev eth1 label eth1:2
    ***.***.***.** dev eth1 label eth1:3
    ***.***.***.** dev eth1 label eth1:4
    ***.***.***.** dev eth1 label eth1:5
    ***.***.***.** dev eth1 label eth1:6
    ***.***.***.** dev eth1 label eth1:7
    ***.***.***.** dev eth1 label eth1:8
    ***.***.***.** dev eth1 label eth1:9
    ***.***.***.** dev eth1 label eth1:10

  }

  notify_master "/usr/local/src/keepalived/take-ips.py"
}
# keepalived.conf in borg (secondary server)
vrrp_instance LOAD_BALANCER {
  nopreempt
  interface eth1
  state MASTER
  virtual_router_id 51
  priority 90

  virtual_ipaddress_excluded {
    ***.***.***.** dev eth1 label eth1:1
    ***.***.***.** dev eth1 label eth1:2
    ***.***.***.** dev eth1 label eth1:3
    ***.***.***.** dev eth1 label eth1:4
    ***.***.***.** dev eth1 label eth1:5
    ***.***.***.** dev eth1 label eth1:6
    ***.***.***.** dev eth1 label eth1:7
    ***.***.***.** dev eth1 label eth1:8
    ***.***.***.** dev eth1 label eth1:9
    ***.***.***.** dev eth1 label eth1:10

  }

  notify_master "/usr/local/src/keepalived/take-ips.py"
}

Only ping master agent every 2 minutes

There is a piece of code to make the subagent inside keepalived ping the master agent every 2 minutes. It seems that this code is not effective. Here is a patch moving the code a bit later to make it work as expected.

Keepalived won't see second master (EQUAL state)

Greetings,

I have a strange problem with keepalived 1.2.7 on CentOS 6.5 x86.

Description:
Keepalived with EQUAL state enabled won't recognize a second master and initiate a re-election. Instead it keeps the VIP bound on both nodes.

Reproducible: 100%

Steps to Reproduce:

  1. Disable iptables;
  2. Enable STP on Cisco-switch;
  3. Unplug cables from switch;
  4. Plug cables to switch and run keepalived with following config: http://pastebin.com/TVRnZDq3
  5. Wait for about 40 seconds to allow STP to complete;
  6. Run tcpdump and you will get following results: http://pastebin.com/0MMky6rq

Actual results:
Multicast VRRP traffic is seen by both nodes, but they both continue to keep the VIP.

Expected results:
One of the nodes must unbind the VIP.

Problems with GARP

Hi!
I have a problem with GARP in my HA VPN gateway scenario.
Timeline of problem (M - main gateway, S - backup gateway with lower priority)

  • S router stops receiving vrrp multicast packets from master (for any reason)
  • S transition to master
  • S sends GARP
  • S receives high prio vrrp packet from M
  • S transition to slave
    (All these steps above take less than 20 seconds)

result:

  • M doesn't receive any vrrp packets from S, so it doesn't force an election.
  • All routers receive the GARP from S and send all packets to the MAC address of S, but S doesn't act as VPN gateway at this moment.
    And the Cisco IOS ARP cache timeout defaults to 4 hours, so the whole network stops working for 4 hours.

Can I fix this problem via some keepalived settings? I know about Cisco IOS arp timeout settings, but can't change settings of all routers.
Can You add an option to keepalived to send GARP from master every X seconds?
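
For readers unfamiliar with what a gratuitous ARP looks like on the wire, here is a minimal sketch that builds one as raw bytes: an Ethernet broadcast frame carrying an ARP request in which the sender and target protocol addresses are both the virtual IP. The MAC and IP values and the function name are placeholders, the choice of target hardware address is a convention that varies between implementations, and the sketch only constructs the frame (sending it would require a raw socket and privileges). It does not reproduce keepalived's own code.

```python
import socket
import struct


def build_garp_frame(mac: str, ip: str) -> bytes:
    """Build an Ethernet + ARP frame announcing that `ip` lives at `mac`."""
    src_mac = bytes.fromhex(mac.replace(":", ""))
    ip_bytes = socket.inet_aton(ip)
    broadcast = b"\xff" * 6

    # Ethernet header: destination, source, EtherType 0x0806 (ARP).
    eth_header = broadcast + src_mac + struct.pack("!H", 0x0806)

    # ARP header: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # hardware length 6, protocol length 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += src_mac + ip_bytes    # sender hardware / protocol address
    arp += broadcast + ip_bytes  # target hardware / protocol address (same IP)
    return eth_header + arp


if __name__ == "__main__":
    # Placeholder values from documentation ranges, not from the report.
    frame = build_garp_frame("00:00:5e:00:53:01", "192.0.2.1")
    print(len(frame), frame.hex())
```

Hosts that receive such a frame typically overwrite their cached MAC for that IP, which is why a GARP from S that is never followed by one from M can leave stale entries behind.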

Setting of default route broken in keepalived 1.2.8 and 1.2.9

Hi,

I've been using keepalived 1.2.7 successfully for over a year now to implement VRRP on our primary and backup firewalls.

My setup therefore includes sections like:

virtual_routes {
    default via 1.2.3.4 dev vrrp_extern_0
}

After upgrading to keepalived 1.2.8 and, subsequently, to 1.2.9, these route definitions stopped working, with keepalived complaining about default not being a valid ip-address.

I replaced the word default by 0.0.0.0/0 only to realize that there seems to be a bug in prefix handling, too. In fact, according to /proc/net/route, the route set by keepalived was 0.0.0.0/32, thus being a blackhole instead of a default route definition.

I was able to circumvent the problem by setting up notify_master and notify_backup scripts that set and remove the necessary route entries by invoking the ip route command. However, I consider the above behavior as being a serious bug that needs to be fixed.

Best,
Torsten
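
To make the difference between the intended and the installed route concrete, the sketch below compares 0.0.0.0/0 with 0.0.0.0/32 using Python's standard ipaddress module. It only models prefix matching; the destination address is a placeholder from a documentation range and the helper name is hypothetical.

```python
import ipaddress

default_route = ipaddress.ip_network("0.0.0.0/0")   # what was configured
broken_route = ipaddress.ip_network("0.0.0.0/32")   # what ended up installed


def matches(route, destination: str) -> bool:
    """Return True if the destination address falls inside the route's prefix."""
    return ipaddress.ip_address(destination) in route


if __name__ == "__main__":
    destination = "192.0.2.10"  # placeholder destination
    print(matches(default_route, destination))
    print(matches(broken_route, destination))
    print(broken_route.num_addresses)
```

A /0 prefix matches every IPv4 destination, while a /32 prefix matches exactly one address, so the installed route cannot act as a default route.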

VRRP Advertisement source mac is incorrect

Hi

The master router should use the source mac of 00-00-5e-00-01-XX <-- for VRID XX. We are seeing that the source mac is using the physical mac address of the bound interface for vip.

Could you please verify?
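
For reference, the virtual router MAC address defined by the VRRP specifications is 00-00-5E-00-01-{VRID} for IPv4 (RFC 3768) and 00-00-5E-00-02-{VRID} for IPv6 (RFC 5798), with the VRID written as two hexadecimal digits. A small sketch that derives it follows; the function name is hypothetical and the example VRIDs are arbitrary.

```python
def vrrp_virtual_mac(vrid: int, ipv6: bool = False) -> str:
    """Return the VRRP virtual router MAC address for a given VRID."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in the range 1-255")
    prefix = "00:00:5e:00:02" if ipv6 else "00:00:5e:00:01"
    return f"{prefix}:{vrid:02x}"


if __name__ == "__main__":
    print(vrrp_virtual_mac(51))
    print(vrrp_virtual_mac(10, ipv6=True))
```

Comparing this value with the source MAC seen in a packet capture shows whether advertisements are leaving with the virtual MAC or with the interface's physical MAC.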

Keepalived and Openvpn. Errors in logs.

Greetings.

I have two routers running Debian Linux and keepalived 1.2.12, and it was working flawlessly. Recently I installed OpenVPN and noticed messages like these in syslog:
May 5 12:22:28 rt Keepalived_vrrp[4497]: Netlink: filter function error
May 5 12:22:28 rt Keepalived_vrrp[4497]: Netlink: filter function error

The messages appear every time an OpenVPN interface goes up or down. Keepalived is configured to do nothing with the ovpn interfaces, but it looks like it's trying to run the netlink_parse_info function on the tun interface.
Yet it works well, and the messages are the only reason to worry. Has anyone seen anything like this? Is it serious? Should it be fixed somehow, or can I ignore it?

Thank you.

Keepalived on Amazon EC2 Classic instances

Hi,

Not sure if this is an issue or just me not configuring correctly, anyone else had this issue?

I'm trying to set up keepalived (with unicast) across 2 Amazon EC2 instances, both running CentOS 6.5. HAProxy is installed on both machines and 1 Elastic IP is shared between the two of them.

With HAProxy running on both instances, I start Keepalived on both instances and they both start up in master state (split brain). iptables is turned off and the security group is configured to allow traffic, so both instances should be able to broadcast to each other.

My Keepalived config script for LB1:

vrrp_script chk_haproxy {
    script "killall -0 haproxy" # cheaper than pidof
    interval 2 # check every 2 seconds
}

vrrp_instance VI_1 {
    interface eth0
    state MASTER
    virtual_router_id 51
    priority 101
    track_script {
        chk_haproxy weight 2 # +2 if process is present
    }

    unicast_src_ip 10.x.x.x # IP for this instance
    unicast_peer { # Do not use multicast, instead send VRRP
        10.x.x.x # IP for LB2 instance
    }
    notify_master "/etc/keepalived/vrrp.sh"
}

Keepalived Script for LB2:

vrrp_script chk_haproxy {
    script "killall -0 haproxy" # cheaper than pidof
    interval 2 # check every 2 seconds
}
vrrp_instance VI_1 {
    interface eth0
    state MASTER
    virtual_router_id 51
    priority 100
    track_script {
        chk_haproxy weight 2 # +2 if process is present
    }

    unicast_src_ip 10.x.x.x # IP for this instance
    unicast_peer { # Do not use multicast, instead send VRRP
        10.x.x.x # IP for LB1 Instance
    }
    notify_master "/etc/keepalived/vrrp.sh"
}

Is there anything here that I have missed?

Thanks

1.2.6 doesn't work with vrrp sync groups and check scripts

Hi

This simple config doesn't work for me with 1.2.6, but works with 1.2.5:

vrrp_sync_group G1 {
    group {
        VI_0
    }
}

vrrp_script chk_nginx { # Requires keepalived-1.1.13
    script "killall -0 nginx" # widely used idiom
    interval 2 # check every 2 seconds
    fall 2
    rise 2
}
vrrp_instance VI_0 {
    interface eth0
    state MASTER
    priority 100
    virtual_router_id 52
    virtual_ipaddress {
        10.2.0.201
    }
    track_script {
        chk_nginx
    }
}

keepalived 1.2.6 works with this config only if I disable track_script or disable vrrp_sync_group.

Secondary addresses not ping-able

Hi,

I've successfully set up keepalived's VRRP component to implement an active/standby firewall configuration.

Everything works as expected, however one thing puzzles me:

We've got 64 external IP addresses, so my keepalived.conf contains something like:

# Primary IP
virtual_ipaddress {
     1.2.3.1/26 dev vrrp_extern_0
}

# Secondary IPs (got several of those blocks, each with 20 addresses)
virtual_ipaddress_excluded {
    1.2.3.2/26 dev vrrp_extern_0
    1.2.3.3/26 dev vrrp_extern_0
    1.2.3.4/26 dev vrrp_extern_0
    ...

Additionally, there's an iptables rule to open up ICMP:

iptables -t filter -A INPUT -p icmp -j ACCEPT

Now, from the outside, if I ping the primary address, I get a reply - everything works smoothly.
If, on the other hand, I ping one of the secondary addresses, I don't get a reply.

What am I doing wrong here?

Best,
Torsten

PS: iptables OUTPUT policy is ACCEPT.

Keepalived fails when number of interfaces goes over 31

When you define more than 31 interfaces/vrrp_instances in keepalived.conf, you start getting spurious failures (some obviously UP interfaces are reported as DOWN, leading to FAULT state).

This seems to start with this message:
Sep 16 14:56:50 fwemul Keepalived_vrrp[7475]: VRRP_Instance(VI_987) Entering BACKUP STATE
Sep 16 14:56:50 fwemul Keepalived_vrrp[7475]: Netlink: Received message overrun
and leads later to:
Sep 16 14:56:53 fwemul Keepalived_vrrp[7475]: Kernel is reporting: interface vrrp.57 DOWN
Sep 16 14:56:54 fwemul Keepalived_vrrp[7475]: VRRP_Instance(VI_987) Now in FAULT state

It doesn't matter whether those instances are in one VRRP group or spread across multiple groups (but if you create many groups and keep the first group under 31 instances, all the failures occur in subsequent groups).

release tagging

Hi, just a quick request.

When making a new release, could you also do "git tag -a vx.x.x" and push the annotated tag up to Github?

This will make it easier to find the commit of a release both within git itself and on github's downloads section!

Ipv6 Unicast_peer without native_ipv6 throws error

We were testing with unicast_peer and found that when we configured an IPv6 address, it wouldn't work without adding the native_ipv6 flag. That flag seems a bit strange, as keepalived should be able to detect an IPv6 address by itself (right?).

Without this flag, keepalived throws:

 VRRP_Instance(VI_1) Cant sent advert to xxx:xxx (Permission denied)

According to strace, keepalived sends the following message:

sendmsg(11, {msg_name(16)={sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("255.255.255.255")}, msg_iov(1)=[{"xxxx"..., 44}], msg_controllen=0, msg_flags=0}, 0) = -1 EACCES (Permission denied)

This issue is partly informational, for other people experiencing the same error.
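
Detecting the address family from the configured string is straightforward in principle. The sketch below uses Python's standard ipaddress module to pick AF_INET or AF_INET6; the helper name address_family is hypothetical, and this is not how keepalived itself parses peers. The 255.255.255.255 destination in the strace output above is what a failed IPv4 parse typically yields (INADDR_NONE), which would be consistent with the IPv6 string being handled as IPv4, though the report does not confirm that.

```python
import ipaddress
import socket


def address_family(addr: str) -> int:
    """Return socket.AF_INET or socket.AF_INET6 for a textual IP address.

    Hypothetical helper for illustration; raises ValueError if the
    string is not a valid IPv4 or IPv6 address.
    """
    ip = ipaddress.ip_address(addr)
    return socket.AF_INET6 if ip.version == 6 else socket.AF_INET
```

A unicast peer such as 2001:db8::1 would select AF_INET6 without any extra flag, while 10.0.0.1 would select AF_INET.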

Allow include directive in virtual_ipaddress section

Using the include directive inside a vrrp_instance -> virtual_ipaddress section does not add the VIPs defined in the included file. The file is opened and read, but something prevents the VIPs from being allocated; I'm not sure what. Putting the exact contents of the include file directly into the virtual_ipaddress section works perfectly fine. This behaviour is confirmed on v1.2.9 and v1.2.12.

VRRP Sending packets with wrong sourceaddress

I have an issue with keepalived in Vyatta. I created a bug there (https://bugzilla.vyatta.com/show_bug.cgi?id=8452) but I thought it might be wise to create an issue here too.

The issue is that keepalived is ignoring the mcast_src_ip setting for a part of the configuration. The master is sending out packets with the virtual address instead. This completely goes haywire after a failover, where suddenly a virtual address on another interface is used to send the packets.

Please see the Vyatta bug for details and let me know if I'm doing something wrong or something needs testing.

Failure to release master

Two nodes on the same virtual router id both keep Master status and VIP.
The node below should defer to 10.112.0.112 as Master, but it doesn't.

keepalived.conf

vrrp_instance LAN112 {
    state MASTER
    interface eth1.112
    virtual_router_id 112
    priority 10
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.112.0.1
    }
}

tcpdump

09:05:20.465822 IP localhost.localdomain > vrrp.mcast.net: VRRPv2, Advertisement, vrid 112, prio 10, authtype simple, intvl 1s, length 20
09:05:20.466202 IP 10.112.0.112 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 112, prio 30, authtype simple, intvl 1s, length 20

syslog

Mar 30 09:21:53 localhost Keepalived_vrrp[29001]: VRRP_Instance(LAN112) Transition to MASTER STATE
Mar 30 09:21:54 localhost Keepalived_vrrp[29001]: VRRP_Instance(LAN112) Entering MASTER STATE
Mar 30 09:21:54 localhost Keepalived_vrrp[29001]: VRRP_Instance(LAN112) setting protocol VIPs.
Mar 30 09:21:54 localhost Keepalived_vrrp[29001]: VRRP_Instance(LAN112) Sending gratuitous ARPs on eth1.112 for 10.112.0.1

ip addr sh

5: eth1.112@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP 
    inet 10.112.0.104/13 brd 10.119.255.255 scope global eth1.112
       valid_lft forever preferred_lft forever
    inet 10.112.0.1/32 scope global eth1.112
       valid_lft forever preferred_lft forever
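
The expected behaviour can be written down as a small rule. This is a sketch of the VRRPv2 master-side comparison as described in RFC 3768 (a higher advertised priority wins, and on a tie the higher sender IP address wins), not keepalived's implementation. The function name master_should_yield is hypothetical, and the local address 10.112.0.104 is taken from the ip addr output above.

```python
import ipaddress


def master_should_yield(local_prio: int, local_ip: str,
                        peer_prio: int, peer_ip: str) -> bool:
    """Decide whether a node in MASTER state should fall back to BACKUP
    after receiving an advert from a peer (illustrative sketch only)."""
    if peer_prio > local_prio:
        return True
    if peer_prio == local_prio:
        # Tie-break on the primary IP address of the sender.
        return ipaddress.ip_address(peer_ip) > ipaddress.ip_address(local_ip)
    return False


# Values from the report: local prio 10, peer 10.112.0.112 with prio 30.
print(master_should_yield(10, "10.112.0.104", 30, "10.112.0.112"))
```

Under this rule the local node (priority 10) should yield to 10.112.0.112 (priority 30), which is the behaviour the report expects and does not see.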

spec bug about kernel variables

If two kernel-devel versions are installed on the system, the kernel variable in the spec file causes a problem.

My system is CentOS 6.4.

rpm -aq kernel-devel
kernel-devel-2.6.32-279.el6.x86_64
kernel-devel-2.6.32-358.11.1.el6.x86_64

When running rpmbuild -bb keepalived.spec, the build fails.

Error output:

Building ../bin/keepalived
vrrp/vrrp_netlink.o: In function `netlink_if_address_filter':
/opt/ddir/keepalived-1.2.2/keepalived/vrrp/vrrp_netlink.c:546: undefined reference to `update_checker_activity'
vrrp/vrrp.o: In function `reset_vrrp_state':
/opt/ddir/keepalived-1.2.2/keepalived/vrrp/vrrp.c:1204: undefined reference to `ipvs_syncd_cmd'
/opt/ddir/keepalived-1.2.2/keepalived/vrrp/vrrp.c:1209: undefined reference to `ipvs_syncd_cmd'
vrrp/vrrp.o: In function `shutdown_vrrp_instances':
/opt/ddir/keepalived-1.2.2/keepalived/vrrp/vrrp.c:1105: undefined reference to `ipvs_syncd_cmd'
vrrp/vrrp.o: In function `vrrp_state_leave_master':
/opt/ddir/keepalived-1.2.2/keepalived/vrrp/vrrp.c:787: undefined reference to `ipvs_syncd_backup'
vrrp/vrrp.o: In function `vrrp_state_become_master':
/opt/ddir/keepalived-1.2.2/keepalived/vrrp/vrrp.c:732: undefined reference to `ipvs_syncd_master'
vrrp/vrrp_scheduler.o: In function `vrrp_init_state':
/opt/ddir/keepalived-1.2.2/keepalived/vrrp/vrrp_scheduler.c:246: undefined reference to `ipvs_syncd_cmd'
vrrp/vrrp_scheduler.o: In function `vrrp_register_workers':
/opt/ddir/keepalived-1.2.2/keepalived/vrrp/vrrp_scheduler.c:235: undefined reference to `ipvs_syncd_cmd'
vrrp/vrrp_daemon.o: In function `stop_vrrp':
/opt/ddir/keepalived-1.2.2/keepalived/vrrp/vrrp_daemon.c:78: undefined reference to `ipvs_stop'
vrrp/vrrp_daemon.o: In function `start_vrrp':
/opt/ddir/keepalived-1.2.2/keepalived/vrrp/vrrp_daemon.c:105: undefined reference to `ipvs_start'
vrrp/vrrp_daemon.o: In function `reload_vrrp_thread':
/opt/ddir/keepalived-1.2.2/keepalived/vrrp/vrrp_daemon.c:210: undefined reference to `ipvs_stop'
collect2: ld returned 1 exit status
make[1]: *** [all] Error 1
make[1]: Leaving directory `/root/rpmbuild/BUILD/keepalived-1.2.2/keepalived'
make: *** [all] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.OAAccg (%build)

Maybe this is a rare case.

Keepalived 1.2.11 segfaults when starting

I just attempted to upgrade keepalived to the newly released 1.2.11; unfortunately I had to roll back because it segfaults. This occurred on 2 keepalived machines (which share their configuration). I have attempted to debug this, but the way keepalived forks prevents me from obtaining a proper backtrace. This is the output from keepalived -DlRn:

Starting Healthcheck child process, pid=2354
Starting VRRP child process, pid=2355
Netlink reflector reports IP xxx.xxx.xxx.xxx added
Initializing ipvs 2.6
Netlink reflector reports IP xxx.xxx.xxx.xxx added
Netlink reflector reports IP xxxx:xxxx:120:366 added
Netlink reflector reports IP fe80::xxxx:xxxx added
Registering Kernel netlink reflector
Registering Kernel netlink command channel
Opening file '/etc/keepalived/keepalived.conf'.
Configuration is using : 7459 Bytes
Netlink reflector reports IP xxxx:xxxx:120:366 added
Netlink reflector reports IP fe80::xxxx:xxxx added
Using LinkWatch kernel netlink reflector...
Registering Kernel netlink reflector
Registering Kernel netlink command channel
Registering gratuitous ARP shared channel
Opening file '/etc/keepalived/keepalived.conf'.
*** Error in `/usr/bin/keepalived': free(): invalid next size (normal): 0x0000000000908300 ***

The configuration:

! Configuration File for keepalived

global_defs {
        notification_email {
                [email protected]
        }

        notification_email_from keepalived@server
        smtp_server xxx.xxx
        smtp_connect_timeout 30
        router_id LVS_RC
}

vrrp_script check_nginx {
        script "killall -0 nginx"
        interval 1
}

vrrp_instance VI_2 {
        state BACKUP
        interface eth0
        virtual_router_id 164
        priority 100
        advert_int 1
        smtp_alert
        native_ipv6

        authentication {
                auth_type AH
                auth_pass xxxxxxxxx
        }

        track_script {
                check_nginx
        }

        unicast_peer {
                xxx:xxxx:120:c1
        }

        virtual_ipaddress {
                xxx.xxx.xxx.xxx/32 dev eth0
                xxxx:xxxx:c1/96 dev eth0
        }

        ! make sure the VIP is a non-preferred outgoing IP
        notify_master "/usr/bin/ip -6 addr change xxxx:xxxx:1f2:c1/96 dev eth0 preferred_lft 0"
}

Error when both tracked interfaces go down & then up

If the monitored interface(s) on both servers go down and then come back up, keepalived does not bring up the virtual IP and gets stuck with this error:

Keepalived_vrrp[18056]: cant do IP_DROP_MEMBERSHIP errno=Cannot assign requested address (99)

Only restarting the service makes it work again. What could be the issue?

I'm using unicast_peer.
Here is the config I'm using:

! secondary
vrrp_instance LB_1 {
    interface eth0
    virtual_router_id 122
    priority 101

    track_interface {
        ! this is private lan interface    
        bond0
    }

    virtual_ipaddress {
        x.x.x.10/27 dev eth0
    }

    unicast_peer {
        10.0.0.122
        10.0.0.123
    }
}

! primary
vrrp_instance LB_2 {
    interface eth0
    virtual_router_id 123
    priority 100

    track_interface {
        ! this is private lan interface
        bond0
    }

    virtual_ipaddress {
        x.x.x.20/27 dev eth0
    }

    unicast_peer {
        10.0.0.122
        10.0.0.123
    }
}

keepalived with VLANs on bond0

Hi,
it looks like keepalived does not work as expected with VLANs on bond0 (bond0.1, bond0.100, etc.).
Is this a known issue?

keepalived ignores ops in its configuration file

I'm using keepalived 1.2.15 on two servers and keepalived doesn't seem to be picking up the fact that the virtual_server stanza includes 'ops'.

Here's the section of the keepalived.conf file:
virtual_server 192.168.0.56 514 {
    delay_loop 6
    lvs_sched rr
    lb_kind DR
    ops
    ...
}

Here's what I'm seeing in syslog when I run 'keepalived -d':

Keepalived_healthcheckers[6083]: System is compiled with LVS v1.2.1
Keepalived_healthcheckers[6083]: VIP = 192.168.0.56, VPORT = 514
Keepalived_healthcheckers[6083]: delay_loop = 6, lb_algo = rr
Keepalived_healthcheckers[6083]: protocol = UDP
Keepalived_healthcheckers[6083]: alpha is OFF, omega is OFF
Keepalived_healthcheckers[6083]: quorum = 1, hysteresis = 0
Keepalived_healthcheckers[6083]: lb_kind = DR

There's no mention of ops and running 'ipvsadm -S' shows that it isn't activated.

FileDescriptor open but never closed

Hi all,

First of all, sorry if this issue report isn't perfect or complete; I'm not the one who noticed the problem on our server.

It seems that every time keepalived opens the configuration file, a new file descriptor is allocated but never released.
In our architecture, where we reload the configuration file, we hit a "too many open files" error quite easily, which leaves keepalived running but without any configuration.

Hope you can provide a fix soon.
Regards

FileDescriptor open but never closed (part 2)

I confirm issue #104.

OS: CentOS release 6.5 (Final)
Name: keepalived
Version: 1.2.13
Release: 5

rpm -qa|grep libnl

libnl-1.1.4-2.el6.x86_64
libnl-devel-1.1.4-2.el6.x86_64

cat /etc/keepalived/keepalived.conf

global_defs {
router_id LVS
}

lsof -p 15152 | grep sock

keepalive 15152 root 3u unix 0xffff880866ed7700 0t0 64357831 socket
keepalive 15152 root 6u sock 0,7 0t0 64412726 can't identify protocol
keepalive 15152 root 8u sock 0,7 0t0 64142259 can't identify protocol
keepalive 15152 root 9u sock 0,7 0t0 64329626 can't identify protocol
keepalive 15152 root 10u sock 0,7 0t0 64376820 can't identify protocol
keepalive 15152 root 11u sock 0,7 0t0 64412679 can't identify protocol
keepalive 15152 root 12u sock 0,7 0t0 64412687 can't identify protocol
keepalive 15152 root 13u sock 0,7 0t0 64412695 can't identify protocol
keepalive 15152 root 14u sock 0,7 0t0 64412703 can't identify protocol
keepalive 15152 root 15u sock 0,7 0t0 64412711 can't identify protocol
keepalive 15152 root 16u sock 0,7 0t0 64412719 can't identify protocol
keepalive 15152 root 17u sock 0,7 0t0 64412727 can't identify protocol

Each reload creates one new socket.

keepalived 1.2.7 from CentOS repo is ok.
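
One way to confirm a leak like this is to count the descriptors before and after each reload. A minimal sketch, assuming a Linux system with /proc mounted and sufficient privileges to read the target process's fd directory; count_open_fds is a hypothetical helper, not part of keepalived:

```python
import os


def count_open_fds(pid: int) -> int:
    """Count the open file descriptors of a process by listing
    /proc/<pid>/fd (Linux only; illustrative helper)."""
    return len(os.listdir("/proc/%d/fd" % pid))
```

Calling count_open_fds with the keepalived PID (15152 in the lsof output above) before and after a reload should return the same number if nothing leaks; a count that grows by one per reload would match the behaviour reported here.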

3 node unicast with IPSEC-AH auth_type not work

Hi,

I'm just trying this simple config, with 3 nodes in unicast and IPSEC-AH authentication:

10.x.x.21 --> MASTER
10.x.x.22 --> BACKUP1
10.x.x.23 --> BACKUP2

1 VIP 10.x.x.20

I see the same behaviour first with Keepalived 1.2.13 and then with 1.2.16 (from Debian experimental).

The config files are:

ON NODE1 - vrrp01 - 10.x.x.21

vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
    fall 2
    rise 2
}
vrrp_instance VI_1 {
    debug 4
    interface eth0
    state MASTER
    virtual_router_id 99
    priority 101
    authentication {
        auth_type AH
        auth_pass XXXXX
    }
    unicast_src_ip 10.x.x.21
    unicast_peer {
        10.x.x.22
        10.x.x.23
    }
    virtual_ipaddress {
        10.x.x.20 dev eth0
    }
    track_script {
        chk_haproxy
    }
}

ON NODE2 - vrrp02 - 10.x.x.22

vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
    fall 2
    rise 2
}
vrrp_instance VI_1 {
    debug 4
    interface eth0
    state BACKUP
    virtual_router_id 99
    priority 100
    authentication {
        auth_type AH
        auth_pass XXXXX
    }
    unicast_src_ip 10.x.x.22
    unicast_peer {
        10.x.x.21
        10.x.x.23
    }
    virtual_ipaddress {
        10.x.x.20 dev eth0
    }
    track_script {
        chk_haproxy
    }
}

ON NODE3 - vrrp03 - 10.x.x.23

vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
    fall 2
    rise 2
}
vrrp_instance VI_1 {
    debug 4
    interface eth0
    state BACKUP
    virtual_router_id 99
    priority 50
    authentication {
        auth_type AH
        auth_pass XXXXX
    }
    unicast_src_ip 10.x.x.23
    unicast_peer {
        10.x.x.21
        10.x.x.22
    }
    virtual_ipaddress {
        10.x.x.20 dev eth0
    }
    track_script {
        chk_haproxy
    }
}

tcpdump on NODE1:
10.x.x.21 > 10.x.x.22: AH(spi=0x0a030315,seq=0x184a): VRRPv2, Advertisement, vrid 99, prio 103, authtype ah, intvl 1s, length 20
10.x.x.21 > 10.x.x.23: AH(spi=0x0a030315,seq=0x184b): VRRPv2, Advertisement, vrid 99, prio 103, authtype ah, intvl 1s, length 20

tcpdump on NODE2:
10.x.x.21 > 10.x.x.22: AH(spi=0x0a030315,seq=0x184a): VRRPv2, Advertisement, vrid 99, prio 103, authtype ah, intvl 1s, length 20

tcpdump on NODE3:
10.x.x.21 > 10.x.x.23: AH(spi=0x0a030315,seq=0x184b): VRRPv2, Advertisement, vrid 99, prio 103, authtype ah, intvl 1s, length 20
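
The prio 103 in these adverts matches NODE1's configured priority of 101 plus the chk_haproxy weight of 2. A rough sketch of that arithmetic, assuming the documented keepalived rule that a positive weight is added while the script succeeds and a negative weight is applied while it fails. The function name effective_priority and the clamp to 1..254 are assumptions for illustration, not taken from the report.

```python
from typing import Iterable, Tuple


def effective_priority(base: int, scripts: Iterable[Tuple[int, bool]]) -> int:
    """Compute an advertised priority from a base priority and a list of
    (weight, script_succeeded) pairs. Illustrative sketch only."""
    prio = base
    for weight, succeeded in scripts:
        if weight > 0 and succeeded:
            prio += weight      # bonus while the check passes
        elif weight < 0 and not succeeded:
            prio += weight      # penalty while the check fails
    # Assumption: keep the result inside the non-owner range 1..254.
    return max(1, min(254, prio))


# NODE1, NODE2 and NODE3 from the report, all with haproxy running.
for base in (101, 100, 50):
    print(base, "->", effective_priority(base, [(2, True)]))
```

With haproxy running everywhere, the three nodes would advertise 103, 102 and 52, so the relative order of the configured priorities is preserved.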

At startup of the 3 nodes, NODE3 logs these messages every second:

VRRP_Instance(VI_1) IPSEC-AH : invalid IPSEC HMAC-MD5 value. Due to fields mutation or bad password !
bogus VRRP packet received on eth0 !!!
VRRP_Instance(VI_1) ignoring received advertisment...

Then, when I test failures of node1 and/or node2 (keepalived stop then start), I get messages like these randomly on every node:

VRRP_Instance(VI_1) IPSEC-AH : sequence number 1383 already proceeded. Packet dropped. Local(1384)
bogus VRRP packet received on eth0 !!!
VRRP_Instance(VI_1) Dropping received VRRP packet...

and the VIP becomes active on 2 nodes simultaneously.
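
The "already proceeded" message is an anti-replay check on the AH sequence number. The sketch below shows the general idea with a single counter per receiving instance. Whether keepalived 1.2.x keeps that counter per instance or per peer is not established by this report, so treat the single shared counter as an assumption. ReplayCheck is a hypothetical name.

```python
class ReplayCheck:
    """Minimal anti-replay check: accept a packet only if its sequence
    number is strictly greater than the last accepted one."""

    def __init__(self) -> None:
        self.last_seq = 0

    def accept(self, seq: int) -> bool:
        if seq <= self.last_seq:
            return False        # treated as a replay and dropped
        self.last_seq = seq
        return True


check = ReplayCheck()
print(check.accept(1384))  # advert from one sender
print(check.accept(1383))  # advert from another sender with a lower counter
```

With one shared counter, an advert from a second sender whose own counter is lower (for example after a restart) is dropped even though it is not a replay. That would be consistent with the log line above showing sequence number 1383 rejected against a local value of 1384.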

The debug messages are:

NODE1:

May 7 20:29:18 vrrp01 Keepalived[14655]: Starting Keepalived v1.2.16 (05/07,2015)
May 7 20:29:18 vrrp01 Keepalived[14656]: Starting Healthcheck child process, pid=14659
May 7 20:29:18 vrrp01 Keepalived[14656]: Starting VRRP child process, pid=14660
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Registering Kernel netlink reflector
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Registering Kernel netlink command channel
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Registering gratuitous ARP shared channel
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Opening file '/etc/keepalived/keepalived.conf'.
May 7 20:29:18 vrrp01 Keepalived_healthcheckers[14659]: Initializing ipvs 2.6
May 7 20:29:18 vrrp01 Keepalived_healthcheckers[14659]: Registering Kernel netlink reflector
May 7 20:29:18 vrrp01 Keepalived_healthcheckers[14659]: Registering Kernel netlink command channel
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Configuration is using : 64385 Bytes
May 7 20:29:18 vrrp01 Keepalived_healthcheckers[14659]: Opening file '/etc/keepalived/keepalived.conf'.
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: ------< Global definitions >------
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Router ID = vrrp01
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Smtp server connection timeout = 30
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Email notification from = root@vrrp01
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: VRRP IPv4 mcast group = 224.0.0.18
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: VRRP IPv6 mcast group = ff02::12
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: SNMP Trap disabled
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: ------< VRRP Topology >------
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: VRRP Instance = VI_1
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Want State = MASTER
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Runing on device = eth0
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Using src_ip = 10.x.x.21
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Gratuitous ARP repeat = 5
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Gratuitous ARP refresh repeat = 1
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Virtual Router ID = 99
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Priority = 101
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Advert interval = 1sec
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Authentication type = IPSEC_AH
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Password = xxxxxxx
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Tracked scripts = 1
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: chk_haproxy weight 2
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Unicast Peer = 2
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: 10.x.x.22
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: 10.x.x.23
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Virtual IP = 1
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: 10.x.x.20/32 dev eth0 scope global
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: ------< VRRP Scripts >------
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: VRRP Script = chk_haproxy
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Command = killall -0 haproxy
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Interval = 2 sec
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Timeout = 0 sec
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Weight = 2
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Rise = 2
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Fall = 2
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Status = INIT
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: Using LinkWatch kernel netlink reflector...
May 7 20:29:18 vrrp01 Keepalived_healthcheckers[14659]: Configuration is using : 5164 Bytes
May 7 20:29:18 vrrp01 Keepalived_healthcheckers[14659]: ------< Global definitions >------
May 7 20:29:18 vrrp01 Keepalived_healthcheckers[14659]: Router ID = vrrp01
May 7 20:29:18 vrrp01 Keepalived_healthcheckers[14659]: Smtp server connection timeout = 30
May 7 20:29:18 vrrp01 Keepalived_healthcheckers[14659]: Email notification from = root@vrrp01
May 7 20:29:18 vrrp01 Keepalived_healthcheckers[14659]: VRRP IPv4 mcast group = 224.0.0.18
May 7 20:29:18 vrrp01 Keepalived_healthcheckers[14659]: VRRP IPv6 mcast group = ff02::12
May 7 20:29:18 vrrp01 Keepalived_healthcheckers[14659]: SNMP Trap disabled
May 7 20:29:18 vrrp01 Keepalived_healthcheckers[14659]: ------< SSL definitions >------
May 7 20:29:18 vrrp01 Keepalived_healthcheckers[14659]: Using autogen SSL context
May 7 20:29:18 vrrp01 Keepalived_healthcheckers[14659]: Using LinkWatch kernel netlink reflector...
May 7 20:29:18 vrrp01 Keepalived_vrrp[14660]: VRRP_Script(chk_haproxy) succeeded
May 7 20:29:19 vrrp01 Keepalived_vrrp[14660]: VRRP_Instance(VI_1) Transition to MASTER STATE
May 7 20:29:20 vrrp01 Keepalived_vrrp[14660]: VRRP_Instance(VI_1) Entering MASTER STATE

NODE2:

May 7 20:29:18 vrrp02 Keepalived[15187]: Starting Keepalived v1.2.16 (05/07,2015)
May 7 20:29:18 vrrp02 Keepalived[15188]: Starting Healthcheck child process, pid=15191
May 7 20:29:18 vrrp02 Keepalived[15188]: Starting VRRP child process, pid=15192
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Registering Kernel netlink reflector
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Registering Kernel netlink command channel
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Registering gratuitous ARP shared channel
May 7 20:29:18 vrrp02 Keepalived_healthcheckers[15191]: Initializing ipvs 2.6
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Opening file '/etc/keepalived/keepalived.conf'.
May 7 20:29:18 vrrp02 Keepalived_healthcheckers[15191]: Registering Kernel netlink reflector
May 7 20:29:18 vrrp02 Keepalived_healthcheckers[15191]: Registering Kernel netlink command channel
May 7 20:29:18 vrrp02 Keepalived_healthcheckers[15191]: Opening file '/etc/keepalived/keepalived.conf'.
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Configuration is using : 64385 Bytes
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: ------< Global definitions >------
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Router ID = vrrp02
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Smtp server connection timeout = 30
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Email notification from = root@vrrp02
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: VRRP IPv4 mcast group = 224.0.0.18
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: VRRP IPv6 mcast group = ff02::12
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: SNMP Trap disabled
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: ------< VRRP Topology >------
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: VRRP Instance = VI_1
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Want State = BACKUP
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Runing on device = eth0
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Using src_ip = 10.x.x.22
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Gratuitous ARP repeat = 5
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Gratuitous ARP refresh repeat = 1
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Virtual Router ID = 99
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Priority = 100
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Advert interval = 1sec
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Authentication type = IPSEC_AH
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Password = xxxxxxx
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Tracked scripts = 1
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: chk_haproxy weight 2
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Unicast Peer = 2
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: 10.x.x.21
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: 10.x.x.23
May 7 20:29:18 vrrp02 Keepalived_healthcheckers[15191]: Configuration is using : 5164 Bytes
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Virtual IP = 1
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: 10.x.x.20/32 dev eth0 scope global
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: ------< VRRP Scripts >------
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: VRRP Script = chk_haproxy
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Command = killall -0 haproxy
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Interval = 2 sec
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Timeout = 0 sec
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Weight = 2
May 7 20:29:18 vrrp02 Keepalived_vrrp[15192]: Rise = 2
May 7 20:29:19 vrrp02 Keepalived_vrrp[15192]: Fall = 2
May 7 20:29:19 vrrp02 Keepalived_vrrp[15192]: Status = INIT
May 7 20:29:19 vrrp02 Keepalived_vrrp[15192]: Using LinkWatch kernel netlink reflector...
May 7 20:29:19 vrrp02 Keepalived_vrrp[15192]: VRRP_Instance(VI_1) Entering BACKUP STATE
May 7 20:29:19 vrrp02 Keepalived_healthcheckers[15191]: ------< Global definitions >------
May 7 20:29:19 vrrp02 Keepalived_healthcheckers[15191]: Router ID = vrrp02
May 7 20:29:19 vrrp02 Keepalived_healthcheckers[15191]: Smtp server connection timeout = 30
May 7 20:29:19 vrrp02 Keepalived_healthcheckers[15191]: Email notification from = root@vrrp02
May 7 20:29:19 vrrp02 Keepalived_healthcheckers[15191]: VRRP IPv4 mcast group = 224.0.0.18
May 7 20:29:19 vrrp02 Keepalived_healthcheckers[15191]: VRRP IPv6 mcast group = ff02::12
May 7 20:29:19 vrrp02 Keepalived_healthcheckers[15191]: SNMP Trap disabled
May 7 20:29:19 vrrp02 Keepalived_healthcheckers[15191]: ------< SSL definitions >------
May 7 20:29:19 vrrp02 Keepalived_healthcheckers[15191]: Using autogen SSL context
May 7 20:29:19 vrrp02 Keepalived_healthcheckers[15191]: Using LinkWatch kernel netlink reflector...
May 7 20:29:19 vrrp02 Keepalived_vrrp[15192]: VRRP_Script(chk_haproxy) succeeded

NODE3:

May 7 20:29:18 vrrp03 Keepalived[15658]: Starting Keepalived v1.2.16 (05/07,2015)
May 7 20:29:18 vrrp03 Keepalived[15659]: Starting Healthcheck child process, pid=15662
May 7 20:29:18 vrrp03 Keepalived[15659]: Starting VRRP child process, pid=15663
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Registering Kernel netlink reflector
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Registering Kernel netlink command channel
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Registering gratuitous ARP shared channel
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Opening file '/etc/keepalived/keepalived.conf'.
May 7 20:29:18 vrrp03 Keepalived_healthcheckers[15662]: Initializing ipvs 2.6
May 7 20:29:18 vrrp03 Keepalived_healthcheckers[15662]: Registering Kernel netlink reflector
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Configuration is using : 64383 Bytes
May 7 20:29:18 vrrp03 Keepalived_healthcheckers[15662]: Registering Kernel netlink command channel
May 7 20:29:18 vrrp03 Keepalived_healthcheckers[15662]: Opening file '/etc/keepalived/keepalived.conf'.
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: ------< Global definitions >------
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Router ID = vrrp03
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Smtp server connection timeout = 30
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Email notification from = root@vrrp03
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: VRRP IPv4 mcast group = 224.0.0.18
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: VRRP IPv6 mcast group = ff02::12
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: SNMP Trap disabled
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: ------< VRRP Topology >------
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: VRRP Instance = VI_1
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Want State = BACKUP
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Runing on device = eth0
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Using src_ip = 10.x.x.23
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Gratuitous ARP repeat = 5
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Gratuitous ARP refresh repeat = 1
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Virtual Router ID = 99
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Priority = 50
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Advert interval = 1sec
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Authentication type = IPSEC_AH
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Password = xxxxxxxx
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Tracked scripts = 1
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: chk_haproxy weight 2
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Unicast Peer = 2
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: 10.x.x.21
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: 10.x.x.22
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Virtual IP = 1
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: 10.x.x.20/32 dev eth0 scope global
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: ------< VRRP Scripts >------
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: VRRP Script = chk_haproxy
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Command = killall -0 haproxy
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Interval = 2 sec
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Timeout = 0 sec
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Weight = 2
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Rise = 2
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Fall = 2
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Status = INIT
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: Using LinkWatch kernel netlink reflector...
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: VRRP_Instance(VI_1) Entering BACKUP STATE
May 7 20:29:18 vrrp03 Keepalived_healthcheckers[15662]: Configuration is using : 5162 Bytes
May 7 20:29:18 vrrp03 Keepalived_healthcheckers[15662]: ------< Global definitions >------
May 7 20:29:18 vrrp03 Keepalived_healthcheckers[15662]: Router ID = vrrp03
May 7 20:29:18 vrrp03 Keepalived_healthcheckers[15662]: Smtp server connection timeout = 30
May 7 20:29:18 vrrp03 Keepalived_healthcheckers[15662]: Email notification from = root@vrrp03
May 7 20:29:18 vrrp03 Keepalived_healthcheckers[15662]: VRRP IPv4 mcast group = 224.0.0.18
May 7 20:29:18 vrrp03 Keepalived_healthcheckers[15662]: VRRP IPv6 mcast group = ff02::12
May 7 20:29:18 vrrp03 Keepalived_healthcheckers[15662]: SNMP Trap disabled
May 7 20:29:18 vrrp03 Keepalived_healthcheckers[15662]: ------< SSL definitions >------
May 7 20:29:18 vrrp03 Keepalived_healthcheckers[15662]: Using autogen SSL context
May 7 20:29:18 vrrp03 Keepalived_healthcheckers[15662]: Using LinkWatch kernel netlink reflector...
May 7 20:29:18 vrrp03 Keepalived_vrrp[15663]: VRRP_Script(chk_haproxy) succeeded
May 7 20:29:19 vrrp03 Keepalived_vrrp[15663]: VRRP_Instance(VI_1) IPSEC-AH : invalid IPSEC HMAC-MD5 value. Due to fields mutation or bad password !
May 7 20:29:19 vrrp03 Keepalived_vrrp[15663]: bogus VRRP packet received on eth0 !!!
May 7 20:29:19 vrrp03 Keepalived_vrrp[15663]: VRRP_Instance(VI_1) ignoring received advertisment...
May 7 20:29:20 vrrp03 Keepalived_vrrp[15663]: VRRP_Instance(VI_1) IPSEC-AH : invalid IPSEC HMAC-MD5 value. Due to fields mutation or bad password !
May 7 20:29:20 vrrp03 Keepalived_vrrp[15663]: bogus VRRP packet received on eth0 !!!
May 7 20:29:20 vrrp03 Keepalived_vrrp[15663]: VRRP_Instance(VI_1) ignoring received advertisment...
May 7 20:29:21 vrrp03 Keepalived_vrrp[15663]: VRRP_Instance(VI_1) IPSEC-AH : invalid IPSEC HMAC-MD5 value. Due to fields mutation or bad password !
May 7 20:29:21 vrrp03 Keepalived_vrrp[15663]: bogus VRRP packet received on eth0 !!!
May 7 20:29:21 vrrp03 Keepalived_vrrp[15663]: VRRP_Instance(VI_1) ignoring received advertisment...
May 7 20:29:22 vrrp03 Keepalived_vrrp[15663]: VRRP_Instance(VI_1) IPSEC-AH : invalid IPSEC HMAC-MD5 value. Due to fields mutation or bad password !
May 7 20:29:22 vrrp03 Keepalived_vrrp[15663]: bogus VRRP packet received on eth0 !!!
May 7 20:29:22 vrrp03 Keepalived_vrrp[15663]: VRRP_Instance(VI_1) ignoring received advertisment...
May 7 20:29:23 vrrp03 Keepalived_vrrp[15663]: VRRP_Instance(VI_1) IPSEC-AH : invalid IPSEC HMAC-MD5 value. Due to fields mutation or bad password !
May 7 20:29:23 vrrp03 Keepalived_vrrp[15663]: bogus VRRP packet received on eth0 !!!
May 7 20:29:23 vrrp03 Keepalived_vrrp[15663]: VRRP_Instance(VI_1) ignoring received advertisment...
May 7 20:29:24 vrrp03 Keepalived_vrrp[15663]: VRRP_Instance(VI_1) IPSEC-AH : invalid IPSEC HMAC-MD5 value. Due to fields mutation or bad password !
May 7 20:29:24 vrrp03 Keepalived_vrrp[15663]: bogus VRRP packet received on eth0 !!!
May 7 20:29:24 vrrp03 Keepalived_vrrp[15663]: VRRP_Instance(VI_1) ignoring received advertisment...
May 7 20:29:25 vrrp03 Keepalived_vrrp[15663]: VRRP_Instance(VI_1) IPSEC-AH : invalid IPSEC HMAC-MD5 value. Due to fields mutation or bad password !
May 7 20:29:25 vrrp03 Keepalived_vrrp[15663]: bogus VRRP packet received on eth0 !!!
May 7 20:29:25 vrrp03 Keepalived_vrrp[15663]: VRRP_Instance(VI_1) ignoring received advertisment...
May 7 20:29:26 vrrp03 Keepalived_vrrp[15663]: VRRP_Instance(VI_1) IPSEC-AH : invalid IPSEC HMAC-MD5 value. Due to fields mutation or bad password !
May 7 20:29:26 vrrp03 Keepalived_vrrp[15663]: bogus VRRP packet received on eth0 !!!
May 7 20:29:26 vrrp03 Keepalived_vrrp[15663]: VRRP_Instance(VI_1) ignoring received advertisment...
May 7 20:29:27 vrrp03 Keepalived_vrrp[15663]: VRRP_Instance(VI_1) IPSEC-AH : invalid IPSEC HMAC-MD5 value. Due to fields mutation or bad password !
May 7 20:29:27 vrrp03 Keepalived_vrrp[15663]: bogus VRRP packet received on eth0 !!!
May 7 20:29:27 vrrp03 Keepalived_vrrp[15663]: VRRP_Instance(VI_1) ignoring received advertisment...
.
.
.
.

Any hints?
Thanks.

VRRP crashes on reload when using virtual_routes

If you have the following keepalived.conf:

vrrp_instance eth0 {
    interface eth0
    virtual_router_id 10
    virtual_ipaddress {
       192.168.1.1/30
    }
    virtual_routes {
        192.168.2.0/24 via 192.168.1.2 dev eth0
    }
}

...and then reload keepalived with SIGHUP, the VRRP child process crashes with the following log messages showing up. (This test server was already in the MASTER state and there were no other VRRP speakers on eth0.)

Keepalived_healthcheckers[1740]: Initializing ipvs 2.6
Keepalived_healthcheckers[1740]: IPVS: Can't initialize ipvs: Protocol not available
Keepalived_healthcheckers[1740]: Registering Kernel netlink reflector
Keepalived_healthcheckers[1740]: Registering Kernel netlink command channel
Keepalived_healthcheckers[1740]: Opening file '/etc/keepalived/keepalived.conf'.
Keepalived_healthcheckers[1740]: Configuration is using : 3157 Bytes
Keepalived_healthcheckers[1740]: Using LinkWatch kernel netlink reflector...
Keepalived_vrrp[1741]: Registering Kernel netlink reflector
Keepalived_vrrp[1741]: Registering Kernel netlink command channel
Keepalived_vrrp[1741]: Registering gratuitous ARP shared channel
Keepalived_vrrp[1741]: Initializing ipvs 2.6
Keepalived_vrrp[1741]: IPVS: Can't initialize ipvs: Protocol not available
Keepalived_vrrp[1741]: Opening file '/etc/keepalived/keepalived.conf'.
kernel: [ 2650.704536] keepalived[1741]: segfault at 0 ip 0000000000410972 sp 00007fff34790dc8 error 4 in keepalived[400000+36000]
Keepalived[1738]: VRRP child process(1741) died: Respawning
Keepalived[1738]: Starting VRRP child process, pid=1787
Keepalived_vrrp[1787]: Registering Kernel netlink reflector
Keepalived_vrrp[1787]: Registering Kernel netlink command channel
Keepalived_vrrp[1787]: Registering gratuitous ARP shared channel
Keepalived_vrrp[1787]: Initializing ipvs 2.6
Keepalived_vrrp[1787]: IPVS: Can't initialize ipvs: Protocol not available
Keepalived_vrrp[1787]: Opening file '/etc/keepalived/keepalived.conf'.
Keepalived_vrrp[1787]: Configuration is using : 60250 Bytes
Keepalived_vrrp[1787]: Using LinkWatch kernel netlink reflector...
Keepalived_vrrp[1787]: VRRP_Instance(eth0) Entering BACKUP STATE
Keepalived_vrrp[1787]: VRRP_Instance(eth0) Transition to MASTER STATE
Keepalived_vrrp[1787]: VRRP_Instance(eth0) Entering MASTER STATE

Note the temporary transition to BACKUP state, which lasts for about five seconds. During this period, the address 192.168.1.1/30 and the route to 192.168.2.0/24 are not removed from eth0 - it would appear that the crash makes keepalived forget that it had added them in the first place. This makes the bug a service-impacting one for us: those five seconds are sufficient for another VRRP speaker on the link to transition to MASTER state while the keepalived process that was reloaded and crashed remains in the BACKUP state, yet all the addresses and routes linger. The result is an undesired active/active "split-brain" situation, leading to ARP flip-flopping, asymmetric routing, and so on.
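
One way to check for the lingering state described above is to test whether the VIP is still configured on a node that reports BACKUP. The following is a minimal sketch and not part of keepalived: vip_present is a hypothetical helper that reads `ip -o addr show` style output on standard input and succeeds if the given address is present.

```shell
#!/bin/sh
# vip_present ADDR (hypothetical helper, not part of keepalived)
# Reads `ip -o addr show` style output on stdin and succeeds if ADDR
# (given without a prefix length) appears as an inet/inet6 address.
vip_present() {
    awk -v addr="$1" '
        {
            # Look for the field that follows "inet" or "inet6".
            for (i = 1; i < NF; i++)
                if (($i == "inet" || $i == "inet6") && index($(i + 1), addr "/") == 1)
                    found = 1
        }
        END { exit !found }
    '
}
```

Possible usage on the node that logged "Entering BACKUP STATE": `ip -o addr show dev eth0 | vip_present 192.168.1.1 && echo "192.168.1.1 is still configured on eth0"`.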

git bisect identifies the bug as having been introduced by commit 494bd96:

494bd96adcc6982b8de387ebad1308c01ed097ab is the first bad commit
commit 494bd96adcc6982b8de387ebad1308c01ed097ab
Author: Alexandre Cassen <[email protected]>
Date:   Tue Sep 3 14:38:54 2013 +0200

IPv6 support for virtual_routes and static_routes

gdb says (note this is from a different build than the syslog messages above):

Program received signal SIGSEGV, Segmentation fault.
0x000000000041610a in route_exist (l=0x182f110, iproute=0x182f390) at vrrp_iproute.c:285
285                     if (ROUTE_ISEQ(ipr, iproute)) {
(gdb) bt full
#0  0x000000000041610a in route_exist (l=0x182f110, iproute=0x182f390) at vrrp_iproute.c:285
        ipr = 0x182e460
        e = 0x182e530
#1  0x00000000004163fc in clear_diff_routes (l=0x182f360, n=0x182f110) at vrrp_iproute.c:314
        iproute = 0x182f390
        tmp_str = 0x182e220 "\002"
        e = 0x182f460
#2  0x000000000041d39a in clear_diff_vrrp_vroutes (old_vrrp=0x182e7f0) at vrrp.c:1319
        vrrp = 0x182e220
#3  0x000000000041d69b in clear_diff_vrrp () at vrrp.c:1395
        new_vrrp = 0x182e220
        e = 0x182dba0
        l = 0x18207b0
        vrrp = 0x182e7f0
#4  0x00000000004183a4 in start_vrrp () at vrrp_daemon.c:137
No locals.
#5  0x00000000004185c8 in reload_vrrp_thread (thread=0x7fff5c58bcc0) at vrrp_daemon.c:232
No locals.
#6  0x00000000004271f4 in thread_call (thread=0x7fff5c58bcc0) at scheduler.c:755
No locals.
#7  0x0000000000427225 in launch_scheduler () at scheduler.c:778
        thread = {id = 52, type = 3 '\003', next = 0x0, prev = 0x0, master = 0x181d010, func = 0x418515 <reload_vrrp_thread>,
          arg = 0x0, sands = {tv_sec = 0, tv_usec = 0}, u = {val = 0, fd = 0, c = {pid = 0, status = 0}}}
#8  0x0000000000418817 in start_vrrp_child () at vrrp_daemon.c:335
        pid = 0
        ret = 0
#9  0x0000000000403c59 in start_keepalived () at main.c:85
No locals.
#10 0x0000000000404348 in main (argc=1, argv=0x7fff5c58be48) at main.c:303
No locals.

This is on Ubuntu 12.04.4 LTS, x86_64, with kernel 3.2.0-59-generic.

strange behavior when changing the network mask of the main IP address

Hi

I'm seeing strange behavior from keepalived and I'm wondering whether it is a bug.

I have the following configuration:

  • keepalived v1.2.12 on Debian Linux 6.0.8.
  • eth0:vip0_0 running on my box, with IP 192.168.150.34/24
  • eth0 has IP address 192.168.150.32/24

If I change the netmask of eth0 with ifconfig while keepalived is up and serving eth0:vip0_0, and then stop keepalived like this:
start-stop-daemon --oknodo --stop --quiet --pidfile /var/run/keepalived.pid --exec /usr/sbin/keepalived

Then eth0 loses its main IP address (192.168.150.32 in my case).

So I'm wondering whether this could be a bug in keepalived, or whether I'm doing something wrong (I can send you the keepalived.conf if you wish).

Thank you in advance !
Best regards,
Julien

keepalived 1.2.17 calls notify script twice when it starts in MASTER state

Here is my configuration file:

vrrp_instance pcr-ny4-mktdata-relay {
state MASTER
interface lan0.3003
virtual_router_id 99
priority 200
advert_int 1
authentication {
auth_type PASS
auth_pass kaTest
}
virtual_ipaddress {
192.168.53.250/24 brd 192.168.53.255 dev lan0.3003
}
notify {
/tmp/katest.sh
}
}

bash-4.3$ cat /tmp/katest.sh
#!/bin/sh
echo "$(date) Called with args: $*" >> /tmp/katest.log

After startup, I see:

bash-4.3$ cat /tmp/katest.log
Thu Jun 18 15:05:57 EDT 2015 Called with args: INSTANCE pcr-ny4-mktdata-relay MASTER 200
Thu Jun 18 15:05:57 EDT 2015 Called with args: INSTANCE pcr-ny4-mktdata-relay MASTER 200
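
For reference, the four arguments visible in the log lines above (type, instance name, state, priority) can be picked apart inside the notify script. This is only a sketch, assuming the four-argument form shown in this log; handle_notify is a hypothetical name.

```shell
#!/bin/sh
# handle_notify KIND NAME STATE PRIORITY (hypothetical helper)
# Prints one line describing the transition, or fails on an
# unexpected state. Argument order is assumed from the log above.
handle_notify() {
    kind=$1
    name=$2
    state=$3
    prio=$4
    case $state in
        MASTER|BACKUP|FAULT)
            printf '%s %s entered %s (priority %s)\n' "$kind" "$name" "$state" "$prio"
            ;;
        *)
            printf 'unexpected state: %s\n' "$state" >&2
            return 1
            ;;
    esac
}
```

In a script like /tmp/katest.sh this could be called as `handle_notify "$@" >> /tmp/katest.log`, which would make a duplicated invocation show up as two identical lines.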

I am running it like so:

bash$ keepalived -f /extra_disk/tmp/katest/etc/keepalived/keepalived.conf --vrrp -l -D -n -d
Starting VRRP child process, pid=1112
Netlink reflector reports IP 192.168.50.139 added
Netlink reflector reports IP 10.25.3.133 added
Netlink reflector reports IP 192.168.53.139 added
Netlink reflector reports IP 192.168.53.250 added
Netlink reflector reports IP 192.168.55.139 added
Netlink reflector reports IP 192.168.61.139 added
Netlink reflector reports IP 10.211.0.5 added
Netlink reflector reports IP 192.168.63.139 added
Netlink reflector reports IP 192.168.62.139 added
Netlink reflector reports IP 192.168.60.139 added
Registering Kernel netlink reflector
Registering Kernel netlink command channel
Registering gratuitous ARP shared channel
Opening file '/extra_disk/tmp/katest/etc/keepalived/keepalived.conf'.
Configuration is using : 65479 Bytes
------< Global definitions >------
Router ID = ti139
VRRP IPv4 mcast group = 224.0.0.18
VRRP IPv6 mcast group = ff02::12
------< VRRP Topology >------
VRRP Instance = pcr-ny4-mktdata-relay
Using VRRPv2
Want State = MASTER
Runing on device = lan0.3003
Gratuitous ARP repeat = 5
Gratuitous ARP refresh repeat = 1
Virtual Router ID = 99
Priority = 200
Advert interval = 1 sec

Accept disabled
Authentication type = SIMPLE_PASSWORD
Password = kaTest
Virtual IP = 1
192.168.53.250/24 brd 192.168.53.255 dev lan0.3003 scope global
Generic state transition scripts = 1

  /tmp/katest.sh

Using LinkWatch kernel netlink reflector...
Opening script file /tmp/katest.sh
VRRP sockpool: [ifindex(5), proto(112), unicast(0), fd(10,11)]
VRRP_Instance(pcr-ny4-mktdata-relay) Transition to MASTER STATE
VRRP_Instance(pcr-ny4-mktdata-relay) Entering MASTER STATE
VRRP_Instance(pcr-ny4-mktdata-relay) setting protocol VIPs.
VRRP_Instance(pcr-ny4-mktdata-relay) Sending gratuitous ARPs on lan0.3003 for 192.168.53.250
Opening script file /tmp/katest.sh
VRRP_Instance(pcr-ny4-mktdata-relay) Sending gratuitous ARPs on lan0.3003 for 192.168.53.250
^CStopping Keepalived v1.2.17 (06/18,2015)

How can Keepalived be used in a Java application?

Hi,

I have 5 servers. When my application is running, I need to ping those 5 servers to check the status of each one (i.e., active or passive). If a node is passive, an email should be sent to a particular admin. How can I use Keepalived in my application?

In the keepalived configuration file, the IP address and port are given manually. In my scenario the IP addresses keep changing (the number of servers is not fixed and will grow, possibly to more than 100 in the future).

Please point me in a possible direction.
Thanks in advance,
Sekhar.
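
As a rough illustration of the requirement above, here is a sketch that assumes a plain reachability check is enough and that the server list lives in a text file which can grow over time. It does not use keepalived at all, and being reachable is not the same as a VRRP MASTER or BACKUP state. report_down and the PROBE variable are hypothetical names.

```shell
#!/bin/sh
# report_down FILE (hypothetical helper)
# Prints every host listed in FILE (one per line) for which the probe
# command fails. PROBE is an assumed knob; by default a single ICMP
# echo request is sent with ping.
report_down() {
    while IFS= read -r host; do
        # Skip blank lines and comment lines.
        case $host in
            ''|'#'*) continue ;;
        esac
        # stdin is redirected so the probe cannot eat the host list.
        ${PROBE:-ping -c 1} "$host" >/dev/null 2>&1 </dev/null || printf '%s\n' "$host"
    done < "$1"
}
```

The output of `report_down hosts.txt` (hosts.txt being a hypothetical file name) could then be handed to whatever mail command the system provides.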

Unicast across different subnets

Hi,

I'm trying to set up keepalived with unicast on Amazon VPC, and within a subnet, it works like a charm. To be a bit more fault tolerant, I've set up multiple availability zones, which means that I need to place the machines in separate subnets. Now it doesn't work anymore.

The networks I have are 10.0.0.0/24 and 10.0.1.0/24 and traffic flows just fine between them. TCP, UDP, ICMP and VRRP packets are being routed.

When keepalived sends its advert, tcpdump shows an ARP who-has request, which doesn't resolve because the IP being asked for is on a separate subnet. As far as I can tell, the routing tables I have indicate that the packets should be routed through the gateway, but that doesn't seem to happen when the packet originates from keepalived.
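
To make the ARP observation concrete: a who-has request can only be answered when the target is on the same link, so it is worth confirming that the two peers really fall into different prefixes. A minimal sketch in plain shell arithmetic follows; same_subnet and ip_to_int are hypothetical helpers, and the host addresses used with them are made up for illustration.

```shell
#!/bin/sh
# ip_to_int A.B.C.D: print the IPv4 address as a 32-bit integer.
# The body runs in a subshell so the IFS change does not leak.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# same_subnet ADDR1 ADDR2 PREFIXLEN: succeed if both addresses share
# the same network when masked with PREFIXLEN bits.
same_subnet() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( $a & $mask )) -eq $(( $b & $mask )) ]
}
```

With the /24 networks mentioned above, `same_subnet 10.0.0.5 10.0.1.5 24` fails, so traffic between such peers has to go through a gateway rather than being resolved directly by ARP.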

Am I misunderstanding the unicast support? Shouldn't it work across networks?

cheers,
Guðmundur Bjarni

keepalived 1.2.13 ignores nopreempt

Looks like keepalived 1.2.13 ignores the "nopreempt" option:

keepalived.conf on host1:

...

vrrp_instance EXT {
        interface bond0.312
        virtual_router_id 30
        priority 100
        virtual_ipaddress {
              ...
        }
        lvs_sync_daemon_interface bond0.310
        nopreempt
        state BACKUP
}

...

keepalived.conf on host2:

...

vrrp_instance EXT {
        interface bond0.312
        virtual_router_id 30
        priority 90
        virtual_ipaddress {
              ...
        }
        lvs_sync_daemon_interface bond0.310
        nopreempt
        state BACKUP
}

...

After restarting keepalived on host1, it becomes master again:

...
Keepalived_vrrp[3116]: VRRP_Instance(EXT) Received lower prio advert, forcing new election
...
Keepalived_vrrp[3116]: VRRP_Instance(EXT) Entering MASTER STATE
...

This issue does not happen on 1.2.12.

Missing To: Header in notification mail

Notification mails don't include a To: header, which makes it harder for some mail systems to sort the mails into the right mailbox. There is also an earlier open Debian bug, which I will copy and paste here (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=627169):

Package: keepalived
Version: 1:1.1.20-1
Severity: wishlist

Please add a "To:"-Header to SMTP_HEADERS_CMD. At the moment the
following headers are added (defined in keepalived/include/smtp.h):

#define SMTP_HEADERS_CMD "Date: %s\r\nFrom: %s\r\nSubject: %s\r\n" \
       "X-Mailer: Keepalived\r\n\r\n"

A valid To-Header would make it easier for certain
mailserver/procmail-configurations to sort the mails into the correct
mailbox.

Thx in advance!

Kind regards,
Mit freundlichen Grüßen
Moritz Schüpp
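
To show what the request amounts to, the sketch below renders the header block with an extra To: line using the shell's printf. This is not keepalived's code and not an actual patch; the position of the To: field and the name render_headers are assumptions made for illustration.

```shell
#!/bin/sh
# render_headers DATE FROM TO SUBJECT (hypothetical helper)
# Prints the header block from SMTP_HEADERS_CMD with an added To:
# line. Placing To: right after From: is an assumption.
render_headers() {
    printf 'Date: %s\r\nFrom: %s\r\nTo: %s\r\nSubject: %s\r\nX-Mailer: Keepalived\r\n\r\n' \
        "$1" "$2" "$3" "$4"
}
```

For example, `render_headers "$(date)" root@vrrp03 admin@example.com "test"` prints five header lines followed by the empty line that ends the header section; admin@example.com is a placeholder address.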

Master not noticing the BACKUP becoming MASTER for a short while

Hi!

I guess we have some network problem with multicast... but still... I wonder if this is supposed to happen.

Scenario: I guess something happens that makes the BACKUP miss some VRRP packets. It then becomes MASTER and, according to the log, sends a GARP immediately AND another one after garp_master_delay.

Then it seems to receive the VRRP packets again, so it realizes it should be BACKUP again.

On the MASTER node... NOTHING is noticed, so no new GARP is sent... and things stop working.

We have tried unicast instead, but ran into other problems with IPv6... and we want it to work with multicast.

I tried raising garp_master_delay to tolerate a few extra seconds of network problems, but since a GARP also seems to be sent immediately, that option seems rather useless as far as I understand.

I guess I could write a notify_backup script that connects to the other node over ssh and restarts keepalived, but that seems insanely wrong and primitive... How else should the MASTER realize it is time for a new GARP?

I can simulate the situation with iptables. So it is rather easy to test.
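
A sketch of how such a simulation could look, assuming the adverts are dropped on the BACKUP node's receiving interface. vrrp_block_cmd is a hypothetical helper that only prints the iptables command instead of running it; 112 is the IP protocol number of VRRP, and the interface name is whatever the instance uses.

```shell
#!/bin/sh
# vrrp_block_cmd ACTION IFACE (hypothetical helper)
# Prints, without running it, an iptables command that inserts
# (ACTION=on) or deletes (ACTION=off) a rule dropping incoming
# VRRP adverts (IP protocol 112) on IFACE.
vrrp_block_cmd() {
    case $1 in
        on)  flag=-I ;;
        off) flag=-D ;;
        *)
            echo "usage: vrrp_block_cmd on|off IFACE" >&2
            return 2
            ;;
    esac
    printf 'iptables %s INPUT -i %s -p 112 -j DROP\n' "$flag" "$2"
}
```

Running the printed commands as root on the BACKUP node, for example `vrrp_block_cmd on eth0 | sh`, waiting a few advert intervals, and then `vrrp_block_cmd off eth0 | sh`, should reproduce the short BACKUP to MASTER to BACKUP flap.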

Thanks for any input!

/Erik S, Uppsala University

Keepalived sync daemon

I've enabled the lvs_sync_daemon_interface option; however, it looks like the sync daemon is not working.

Active node:

[MASTER:~]# ipvsadm -Lnc
IPVS connection entries
pro expire state       source             virtual            destination
TCP 00:37  SYN_RECV    192.168.1.20:55746 192.168.1.10:80    192.168.1.15:80

Passive node:

[BACKUP:~]# ipvsadm -Lnc
IPVS connection entries
pro expire state       source             virtual            destination

Config:

! Configuration File for keepalived

global_defs {
    lvs_id lb1 #lb2 on backup server
}

vrrp_sync_group VG1 {
        group {
            LB1
        }
}

vrrp_instance LB1 {
    state BACKUP #BACKUP on backup server
    interface eth0
    lvs_sync_daemon_interface eth0
    virtual_router_id 50
    priority 2 #1 on backup server
    nopreempt
    advert_int 1
    smtp_alert
        authentication {
            auth_type PASS
            auth_pass $PASSWORD
    }
        virtual_ipaddress {
            192.168.1.10
        }
}

virtual_server 192.168.1.10 80 {
    delay_loop 5
    lb_algo wlc
    lb_kind DR
    protocol TCP
    ha_suspend
    sorry_server 127.0.0.1

        real_server 192.168.1.15 80 {
            weight 1
            inhibit_on_failure
                HTTP_GET {
                    url {
                 path /
                 status_code 200
                    }
                    connect_timeout 5
                    nb_get_retry 3
                    delay_before_retry 1
                    connect_port 80
                fwmark 80
                }
        }

        real_server 192.168.1.16 80 {
            weight 1
            inhibit_on_failure
                HTTP_GET {
                    url {
                         path /
                         status_code 200
                        }
                    connect_timeout 5
                    nb_get_retry 3
                    delay_before_retry 1
                    connect_port 80
                fwmark 80
                }
        }
}

It seems that the sync daemon is running:

[MASTER]# ps -elf | grep [i]pvs
1 S root     11207     2  0  80   0 -     0 sync_t 08:41 ?        00:00:00 [ipvs_syncmaster]

[BACKUP]# ps -elf | grep [i]pvs
1 S root      6231     2  0  80   0 -     0 sync_t 08:41 ?        00:00:00 [ipvs_syncbackup]

System: CentOS 6.5, Keepalived v1.2.7 (02/21,2013)

keepalived 1.2.17 does not call vrrp_instance notify script

I upgraded my Fedora 21 system from keepalived 1.2.16 to 1.2.17, and the notify script is no longer called at startup nor when a state transition occurs. I am running "keepalived --vrrp -D" and a sample config file looks like this:

vrrp_instance firewall {
state BACKUP
interface lan0.3009
virtual_router_id 66
priority 150
advert_int 1
authentication {
auth_type PASS
auth_pass pw32wd
}
virtual_ipaddress {
192.168.1.1/28 brd 192.168.1.63 dev lan0.3004
}
notify /etc/keepalived_notify.sh
}

--vrrp still requires ip_vs to be loaded

I want to run keepalived to only manage failover IPs (I have no real servers defined in the configuration file, just virtual IPs), but even though I specify --vrrp ("Only run with VRRP subsystem"), it still tries to do ipvs specific stuff, like load the ip_vs kernel module. I notice from going through the code that ipvs_start is called when initializing the vrrp subsystem and the check subsystem. Is ip_vs kernel module functionality required for just the vrrp subsystem?

In my environment, I'm running keepalived with low privileges but with specific CAP_NET capabilities on the binary. It fails (and goes into an infinite loop respawning the vrrp subsystem process) when it tries to spawn modprobe to load modules that are not already loaded. I can work around this by loading the module beforehand, but the docs give the impression that when --vrrp is specified nothing IPVS-related is done; that is, that --vrrp would allow me to run keepalived on a kernel without IPVS support at all. That doesn't seem to be the case.
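
The workaround mentioned above (loading the module beforehand) could be scripted roughly as follows. This is a sketch: module_loaded is a hypothetical helper that reads /proc/modules style input on standard input, and it will not see IPVS support that is compiled into the kernel rather than built as a module.

```shell
#!/bin/sh
# module_loaded NAME (hypothetical helper)
# Reads /proc/modules style lines on stdin and succeeds if a module
# named exactly NAME is listed in the first column.
module_loaded() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit !found }'
}
```

A privileged start-up step could then run `module_loaded ip_vs < /proc/modules || modprobe ip_vs` before the unprivileged keepalived process is launched.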
