sonic-linkmgrd's Introduction

Project

This repo tracks the source code of linkmgrd for dual-ToR topology.

Coding style guide

Please follow the simple rules laid out below (will update as needed):

Spacing

  • Please configure your editor to expand tabs into spaces. Do not use tab characters for spacing in source code.

  • Please configure your editor to set the tab stop at 4 spaces, and use 4 spaces for each indentation level.

  • Please leave one space after the following keywords:

    if (...)

    while (...)

    for (...)

Curly braces

  • Function definition: the opening brace starts a new line
void function(...)
{
    // code block
}
  • if/for/while with a one-line condition: the opening brace stays on the same line
if (...) {
    // code block
}
  • if/for/while with a condition that spans multiple lines: put the closing ')' and the '{' together on a new line
if (...
    ...
) {
    // code block
}

Indentations

  • Please use 4 spaces for each indentation level.

  • Multi-line conditions (if/while/for) and multi-line function calls: indent the continuation lines one level, and place the closing ')' at the same indentation as the keyword or function name

if (condition line 1 ..
    condition line 2 (indent to '(' above)
    condition line 3
) {
    // code block
}

foo(
    parameter1,
    parameter2,
    parameter3
);
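The rules above can be seen together in one small compilable sample. This is only a sketch of how the spacing, brace, and indentation rules combine; the function names `countValuesInRange` and `countSmallValues` are hypothetical and do not come from the linkmgrd source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, used only to illustrate the layout rules.
// Function definition: the opening brace starts a new line.
int countValuesInRange(const std::vector<int> &values, int lowerBound, int upperBound)
{
    assert(lowerBound <= upperBound);

    int count = 0;
    // One-line condition: one space after 'for', brace on the same line.
    for (std::size_t i = 0; i < values.size(); ++i) {
        // Multi-line condition: continuation line indented one level,
        // closing ')' and '{' on a new line, aligned with 'if'.
        if (values[i] >= lowerBound &&
            values[i] <= upperBound
        ) {
            ++count;
        }
    }

    return count;
}

// Hypothetical caller showing the multi-line function call layout:
// one parameter per line, closing ')' aligned with the statement start.
int countSmallValues(const std::vector<int> &values)
{
    return countValuesInRange(
        values,
        0,
        9
    );
}
```

Calling `countSmallValues({1, 5, 10, -3, 9})` counts the values 1, 5, and 9, so it returns 3.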


sonic-linkmgrd's Issues

[active-standby] Port could not change back to standby if config active then auto in link down

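As the title says: with the link down, setting the mux mode to active and then back to auto leaves the port in active instead of returning it to standby.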

# config interface shutdown Ethernet80
# show mux s | grep Ethernet80
Ethernet80  standby   standby          unhealthy  consistent  2023-Feb-21 12:22:41.162578
# config mux mode active Ethernet80
port        state
----------  ----------
Ethernet80  INPROGRESS
# show mux s | grep Ethernet80
Ethernet80  active    active           unhealthy  consistent  2023-Feb-21 12:22:57.268276
# config mux mode auto Ethernet80
port        state
----------  -------
Ethernet80  OK
# show mux s | grep Ethernet80
Ethernet80  active    active           unhealthy  consistent  2023-Feb-21 12:22:57.268276
# show mux s | grep Ethernet80
Ethernet80  active    active           unhealthy  consistent  2023-Feb-21 12:22:57.268276

`linkmgrd` fails to change back to `active` when gRPC connection is lost

How to reproduce:

  1. Verify that Ethernet4 is initially in the active state, then disable the gRPC server on the NIC.
# show mux s | grep -w Ethernet4
Ethernet4   active    active           healthy   consistent  2023-Feb-08 12:50:21.305147
  2. Set the mux mode to standby.
# config mux mode standby Ethernet4
port       state
---------  ----------
Ethernet4  INPROGRESS
# show mux s | grep -w Ethernet4
Ethernet4   standby   unknown          healthy   inconsistent
  3. Set the mux mode back to auto.
# config mux mode auto Ethernet4
port       state
---------  -------
Ethernet4  OK
  4. The local status is still standby, and the tunnel routes are not removed.
# show mux s | grep -w Ethernet4
Ethernet4   standby   unknown          healthy   inconsistent
# show mux tun
PORT        DEST_TYPE    DEST_ADDRESS       kernel    asic
----------  -----------  -----------------  --------  ------
Ethernet4   server_ipv4  192.168.0.2/32     added     added
Ethernet4   server_ipv6  fc02:1000::2/128   added     added
Ethernet4   soc_ipv4     192.168.0.3/32     -         added

[Flaky test] `LinkmgrdBootupSequenceHeartBeatFirst` & `LinkmgrdBootupSequenceMuxConfigActiveProbeActive`

[ RUN      ] LinkManagerStateMachineActiveActiveTest.LinkmgrdBootupSequenceHeartBeatFirst
test/LinkManagerStateMachineActiveActiveTest.cpp:467: Failure
Expected equality of these values:
  std::get<1>(mTestCompositeState)
    Which is: 2
  mux_state::MuxState::Label::Active
    Which is: 0
test/LinkManagerStateMachineActiveActiveTest.cpp:468: Failure
Expected equality of these values:
  mDbInterfacePtr->mSetMuxStateInvokeCount
    Which is: 1
  2
[  FAILED  ] LinkManagerStateMachineActiveActiveTest.LinkmgrdBootupSequenceHeartBeatFirst (123 ms)

[ RUN      ] LinkManagerStateMachineActiveActiveTest.LinkmgrdBootupSequenceMuxConfigActiveProbeActive
test/LinkManagerStateMachineActiveActiveTest.cpp:529: Failure
Expected equality of these values:
  std::get<1>(mTestCompositeState)
    Which is: 2
  mux_state::MuxState::Label::Active
    Which is: 0
test/LinkManagerStateMachineActiveActiveTest.cpp:530: Failure
Expected equality of these values:
  mDbInterfacePtr->mSetMuxStateInvokeCount
    Which is: 2
  3
test/LinkManagerStateMachineActiveActiveTest.cpp:534: Failure
Expected equality of these values:
  std::get<1>(mTestCompositeState)
    Which is: 2
  mux_state::MuxState::Label::Active
    Which is: 0
[  FAILED  ] LinkManagerStateMachineActiveActiveTest.LinkmgrdBootupSequenceMuxConfigActiveProbeActive (64 ms)

flaky unit test `MuxStandbyConfigActivegRPCError` on master branch

12:03:07 [ RUN ] LinkManagerStateMachineActiveActiveTest.MuxStandbyConfigActivegRPCError
12:03:07 test/LinkManagerStateMachineActiveActiveTest.cpp:1004: Failure
12:03:07 Expected equality of these values:
12:03:07 std::get<0>(mTestCompositeState)
12:03:07 Which is: 2
12:03:07 link_prober::LinkProberState::Label::Wait
12:03:07 Which is: 3
12:03:07 [ FAILED ] LinkManagerStateMachineActiveActiveTest.MuxStandbyConfigActivegRPCError (987 ms)

`show mux status` returns unhealthy after `config load_minigraph`

After `config load_minigraph`, `show mux status` reports unhealthy for every port even though both STATUS and SERVER_STATUS are active:

admin@svcstr-7050-acs-1:~$ show mux s
PORT        STATUS    SERVER_STATUS    HEALTH     HWSTATUS    LAST_SWITCHOVER_TIME
----------  --------  ---------------  ---------  ----------  ---------------------------
Ethernet4   active    active           unhealthy  consistent
Ethernet8   active    active           unhealthy  consistent
Ethernet12  active    active           unhealthy  consistent
Ethernet16  active    active           unhealthy  consistent
Ethernet20  active    active           unhealthy  consistent
Ethernet24  active    active           unhealthy  consistent
Ethernet28  active    active           unhealthy  consistent
Ethernet32  active    active           unhealthy  consistent
Ethernet36  active    active           unhealthy  consistent
Ethernet40  active    active           unhealthy  consistent
Ethernet44  active    active           unhealthy  consistent
Ethernet48  active    active           unhealthy  consistent
Ethernet52  active    active           unhealthy  consistent  2022-Aug-15 05:49:40.869109
Ethernet56  active    active           unhealthy  consistent  2022-Aug-15 05:49:40.914692
Ethernet60  active    active           unhealthy  consistent  2022-Aug-15 05:49:40.407092
Ethernet64  active    active           unhealthy  consistent  2022-Aug-15 05:49:40.574557
Ethernet68  active    active           unhealthy  consistent  2022-Aug-15 05:49:40.814924
Ethernet72  active    active           unhealthy  consistent  2022-Aug-15 05:49:40.472237
Ethernet76  active    active           unhealthy  consistent  2022-Aug-15 05:49:40.755438
Ethernet80  active    active           unhealthy  consistent  2022-Aug-15 05:49:40.962116
Ethernet84  active    active           unhealthy  consistent  2022-Aug-15 05:49:40.529300
Ethernet88  active    active           unhealthy  consistent  2022-Aug-15 05:49:40.714808
Ethernet92  active    active           unhealthy  consistent  2022-Aug-15 05:49:40.620240
Ethernet96  active    active           unhealthy  consistent  2022-Aug-15 05:49:40.664041

The HEALTH column should read healthy.

active-active cable not handled by DbInterface

Even though the cable type is configured as "active-active", linkmgrd still triggers a mux state probe over the i2c bus:

May 27 22:13:57.311413 svcstr-7050-acs-1 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:225 handleMuxConfigNotification: Ethernet8: (P: Unknown, M: Standby, L: Up) -> (P: Unknown, M: Standby, L: Up)
May 27 22:13:57.319794 svcstr-7050-acs-1 WARNING mux#linkmgrd: MuxManager.cpp:185 updateMuxPortConfig: Ethernet4: Mux port config: standby
May 27 22:13:57.320047 svcstr-7050-acs-1 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:225 handleMuxConfigNotification: Ethernet4: (P: Unknown, M: Standby, L: Up) -> (P: Unknown, M: Standby, L: Up)
May 27 22:13:57.320636 svcstr-7050-acs-1 WARNING mux#linkmgrd: DbInterface.cpp:366 handleProbeMuxState: Ethernet4: trigger xcvrd to read Mux State using i2c.

[active-active] no icmp probes with default route present

issue description

With the default route present, linkmgrd does not send ICMP probes:

/var/log/syslog.34.gz:Mar 23 19:03:33.659448 WARNING write_standby: Applying state to interfaces {'Ethernet108': 'active', 'Ethernet112': 'active', 'Ethernet116': 'active', 'Ethernet12': 'active', 'Ethernet120': 'active', 'Ethernet16': 'active', 'Ethernet4': 'active', 'Ethernet44': 'active', 'Ethernet48': 'active', 'Ethernet52': 'active', 'Ethernet56': 'active', 'Ethernet68': 'active', 'Ethernet72': 'active', 'Ethernet76': 'active', 'Ethernet8': 'active', 'Ethernet80': 'active'}
/var/log/syslog.34.gz:Mar 23 19:03:34.074570 INFO lldp#lldpd[22]: MSAP has changed for port Ethernet76, sending a shutdown LLDPDU
/var/log/syslog.34.gz:Mar 23 19:03:34.074917 INFO lldp#supervisord: lldpd 2023-03-23T19:03:34 [INFO/lldp] MSAP has changed for port Ethernet76, sending a shutdown LLDPDU
/var/log/syslog.34.gz:Mar 23 19:03:34.270684 NOTICE swss#orchagent: :- nbrHandler: Processing neighbors for mux Ethernet76, enable 1, state 1
/var/log/syslog.34.gz:Mar 23 19:03:34.270684 NOTICE swss#orchagent: :- setState: [Ethernet76] Set MUX state from standby to active
/var/log/syslog.34.gz:Mar 23 19:03:34.273156 NOTICE swss#orchagent: :- updateNeighbor: Processing update on neighbor 10.50.147.28 for mux Ethernet76, add 1, state 1
/var/log/syslog.34.gz:Mar 23 19:03:34.287891 NOTICE swss#orchagent: :- updateNeighbor: Processing update on neighbor 2603:10b0:d11:8618::a32:931c for mux Ethernet76, add 1, state 1
/var/log/syslog.34.gz:Mar 23 19:03:34.306135 NOTICE swss#orchagent: :- addOperation: Mux State set to active for port Ethernet76
/var/log/syslog.34.gz:Mar 23 19:03:34.675441 INFO caclmgrd[1315]: dhcp packet mark update : '('Ethernet76', 'SET', (('mark', '0x67014'),))'
/var/log/syslog.34.gz:Mar 23 19:03:34.687475 INFO caclmgrd[6577]: DROP  all opt -- in * out *  0.0.0.0/0  -> 0.0.0.0/0   PHYSDEV match --physdev-in Ethernet76
/var/log/syslog.34.gz:Mar 23 19:03:34.699444 INFO caclmgrd[1315]: Update DHCP chain: iptables --delete DHCP -m physdev --physdev-in Ethernet76 -j DROP
/var/log/syslog.34.gz:Mar 23 19:03:35.073452 NOTICE swss#orchagent: :- addOperation: Mux setting State DB entry (hw state active, mux state active) for port Ethernet76
/var/log/syslog.34.gz:Mar 23 19:03:35.081328 INFO caclmgrd[1315]: mux cable update : '('Ethernet76', 'SET', (('state', 'active'),))'
/var/log/syslog.34.gz:Mar 23 19:03:40.752552 WARNING mux#linkmgrd: MuxManager.cpp:222 updatePortCableType: Ethernet76: Port cable type: active-active
/var/log/syslog.34.gz:Mar 23 19:03:40.761061 WARNING mux#linkmgrd: MuxManager.cpp:162 addOrUpdateMuxPort: Ethernet76: server IP: 10.50.147.28
/var/log/syslog.34.gz:Mar 23 19:03:40.765750 WARNING mux#linkmgrd: MuxManager.cpp:185 addOrUpdateMuxPortSoCAddress: Ethernet76: SoC IP: 10.50.147.29
/var/log/syslog.34.gz:Mar 23 19:03:40.777676 WARNING mux#linkmgrd: MuxPort.cpp:434 handleTsaEnable: port: Ethernet76, configuring mux mode due to tsa_enable notification from CONFIG DB. 
/var/log/syslog.34.gz:Mar 23 19:03:40.779564 WARNING mux#linkmgrd: MuxManager.cpp:207 updateMuxPortConfig: Ethernet76: Mux port config: auto
/var/log/syslog.34.gz:Mar 23 19:03:40.787474 WARNING mux#linkmgrd: MuxManager.cpp:262 addOrUpdateMuxPortLinkState: Ethernet76: link state: up
/var/log/syslog.34.gz:Mar 23 19:03:40.788014 WARNING mux#linkmgrd: MuxManager.cpp:288 addOrUpdateMuxPortMuxState: Ethernet76: state db mux state: active
/var/log/syslog.34.gz:Mar 23 19:03:40.788619 WARNING mux#linkmgrd: MuxPort.cpp:365 handleDefaultRouteState: port: Ethernet76, state db default route state: na
/var/log/syslog.34.gz:Mar 23 19:03:40.789398 WARNING mux#linkmgrd: MuxPort.cpp:365 handleDefaultRouteState: port: Ethernet76, state db default route state: ok
/var/log/syslog.34.gz:Mar 23 19:03:40.790520 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:176 handleMuxStateNotification: Ethernet76: state db mux state: Active
/var/log/syslog.34.gz:Mar 23 19:03:40.791360 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:77 activateStateMachine: Ethernet76: MUX port link prober initialized with server IP: 10.50.147.29, server MAC: 04:27:28:7a:00:4c
/var/log/syslog.34.gz:Mar 23 19:03:40.791485 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:86 activateStateMachine: Ethernet76: (P: Wait, M: Active, L: Up) -> (P: Active, M: Active, L: Up)
/var/log/syslog.34.gz:Mar 23 19:03:40.807267 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:917 setLabel: Ethernet76: Linkmgrd state is: Active Unhealthy
/var/log/syslog.34.gz:Mar 23 19:03:40.807868 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:917 setLabel: Ethernet76: Linkmgrd state is: Active Healthy
/var/log/syslog.34.gz:Mar 23 19:03:40.807923 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:336 handlePeerMuxStateNotification: Ethernet76: server side peer forwarding state : Active
/var/log/syslog.34.gz:Mar 23 19:03:40.872978 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:1010 switchMuxState: Ethernet76: Switching MUX state to 'Standby'
/var/log/syslog.34.gz:Mar 23 19:03:40.873073 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:1275 handleDefaultRouteStateNotification: Ethernet76: (P: Active, M: Active, L: Up) -> (P: Active, M: Standby, L: Up)
/var/log/syslog.34.gz:Mar 23 19:03:40.873165 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:917 setLabel: Ethernet76: Linkmgrd state is: Standby Unhealthy
/var/log/syslog.34.gz:Mar 23 19:03:40.873258 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:277 handleProbeMuxStateNotification: Ethernet76: Received unsolicited MUX state probe notification!
/var/log/syslog.34.gz:Mar 23 19:03:40.873803 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:581 handlePeerStateChange: Ethernet76: Received peer link prober event, new state: PeerActive
/var/log/syslog.34.gz:Mar 23 19:03:40.873953 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:1010 switchMuxState: Ethernet76: Switching MUX state to 'Active'
/var/log/syslog.34.gz:Mar 23 19:03:40.873953 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:520 handleStateChange: Ethernet76: Received mux state event, new state: Active
/var/log/syslog.34.gz:Mar 23 19:03:41.107892 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:482 handleStateChange: Ethernet76: Received link prober event, new state: Unknown
/var/log/syslog.34.gz:Mar 23 19:03:41.108072 WARNING mux#linkmgrd: DbInterface.cpp:259 postLinkProberMetricsEvent: Ethernet76: posting link prober event link_prober_unknown_start
/var/log/syslog.34.gz:Mar 23 19:03:41.108280 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:1010 switchMuxState: Ethernet76: Switching MUX state to 'Standby'
/var/log/syslog.34.gz:Mar 23 19:03:41.108458 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:502 handleStateChange: Ethernet76: (P: Active, M: Active, L: Up) -> (P: Unknown, M: Standby, L: Up)
/var/log/syslog.34.gz:Mar 23 19:03:41.110323 WARNING mux#linkmgrd: DbInterface.cpp:511 handlePostLinkProberMetrics: Ethernet76: posting link prober event link_prober_unknown_start
/var/log/syslog.34.gz:Mar 23 19:03:41.158618 NOTICE swss#orchagent: :- setState: [Ethernet76] Set MUX state from active to active
/var/log/syslog.34.gz:Mar 23 19:03:41.158957 NOTICE swss#orchagent: :- setState: [Ethernet76] Maintaining current MUX state
/var/log/syslog.34.gz:Mar 23 19:03:41.159113 NOTICE swss#orchagent: :- addOperation: Mux State set to active for port Ethernet76
/var/log/syslog.34.gz:Mar 23 19:03:41.251892 NOTICE swss#orchagent: :- setState: [Ethernet76] Set MUX state from active to standby
/var/log/syslog.34.gz:Mar 23 19:03:41.251940 NOTICE swss#orchagent: :- nbrHandler: Processing neighbors for mux Ethernet76, enable 0, state 2
/var/log/syslog.34.gz:Mar 23 19:03:41.282362 NOTICE swss#orchagent: :- addOperation: Mux State set to standby for port Ethernet76
/var/log/syslog.34.gz:Mar 23 19:03:41.407679 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:1207 handleMuxWaitTimeout: Ethernet76: orchagent timed out responding to linkmgrd, current state: (P: Unknown, M: Standby, L: Up)
/var/log/syslog.34.gz:Mar 23 19:03:42.355681 NOTICE swss#orchagent: :- addOperation: Mux setting State DB entry (hw state standby, mux state standby) for port Ethernet76
/var/log/syslog.34.gz:Mar 23 19:03:42.356618 WARNING mux#linkmgrd: MuxManager.cpp:288 addOrUpdateMuxPortMuxState: Ethernet76: state db mux state: standby
/var/log/syslog.34.gz:Mar 23 19:03:42.356863 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:176 handleMuxStateNotification: Ethernet76: state db mux state: Standby
/var/log/syslog.34.gz:Mar 23 19:03:42.408051 INFO caclmgrd[1315]: mux cable update : '('Ethernet76', 'SET', (('state', 'standby'),))'
/var/log/syslog.34.gz:Mar 23 19:03:42.929087 NOTICE swss#orchagent: :- addOperation: Mux setting State DB entry (hw state standby, mux state standby) for port Ethernet76
/var/log/syslog.34.gz:Mar 23 19:03:42.929087 WARNING mux#linkmgrd: MuxManager.cpp:288 addOrUpdateMuxPortMuxState: Ethernet76: state db mux state: standby
/var/log/syslog.34.gz:Mar 23 19:03:42.929540 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:176 handleMuxStateNotification: Ethernet76: state db mux state: Standby
  • After `config mux mode active`, the link prober is reported as active:
/var/log/syslog.32.gz:Mar 24 07:14:01.453764 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:1010 switchMuxState: Ethernet76: Switching MUX state to 'Active'
/var/log/syslog.32.gz:Mar 24 07:14:01.453764 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:253 handleMuxConfigNotification: Ethernet76: (P: Unknown, M: Standby, L: Up) -> (P: Unknown, M: Active, L: Up)
/var/log/syslog.32.gz:Mar 24 07:14:01.456502 NOTICE swss#orchagent: :- setState: [Ethernet76] Set MUX state from standby to active
/var/log/syslog.32.gz:Mar 24 07:14:01.457226 NOTICE swss#orchagent: :- nbrHandler: Processing neighbors for mux Ethernet76, enable 1, state 1
/var/log/syslog.32.gz:Mar 24 07:14:01.462369 NOTICE swss#orchagent: :- updateNeighbor: Processing update on neighbor 10.50.147.28 for mux Ethernet76, add 1, state 1
/var/log/syslog.32.gz:Mar 24 07:14:01.472146 WARNING mux#linkmgrd: DbInterface.cpp:259 postLinkProberMetricsEvent: Ethernet76: posting link prober event link_prober_active_start
/var/log/syslog.32.gz:Mar 24 07:14:01.472146 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:482 handleStateChange: Ethernet76: Received link prober event, new state: Active
/var/log/syslog.32.gz:Mar 24 07:14:01.472191 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:502 handleStateChange: Ethernet76: (P: Unknown, M: Active, L: Up) -> (P: Active, M: Active, L: Up)
/var/log/syslog.32.gz:Mar 24 07:14:01.472348 WARNING mux#linkmgrd: DbInterface.cpp:511 handlePostLinkProberMetrics: Ethernet76: posting link prober event link_prober_active_start
/var/log/syslog.32.gz:Mar 24 07:14:01.481870 NOTICE swss#orchagent: :- updateNeighbor: Processing update on neighbor 2603:10b0:d11:8618::a32:931c for mux Ethernet76, add 1, state 1
/var/log/syslog.32.gz:Mar 24 07:14:01.496055 NOTICE swss#orchagent: :- addOperation: Mux State set to active for port Ethernet76
/var/log/syslog.32.gz:Mar 24 07:14:01.505583 NOTICE swss#orchagent: :- addOperation: Mux setting State DB entry (hw state active, mux state active) for port Ethernet76
/var/log/syslog.32.gz:Mar 24 07:14:01.507186 INFO caclmgrd[1315]: mux cable update : '('Ethernet76', 'SET', (('state', 'active'),))'

analysis

There are two issues:
  1. linkmgrd toggles three times after initialization. The second toggle, to active, is unnecessary; it is fixed by PR #191.
  2. ICMP probe sending stops after the first handleDefaultRouteStateNotification, which carries default route state na, but the second default route event, ok, does not restart probe sending as expected. After `config mux mode active`, heartbeat replies are received, which confirms that the ICMP probe sending flag is enabled at that point (handleMuxConfigNotification calls shutdownOrRestartLinkProberOnDefaultRoute to restart or stop the ICMP probes).
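The expected behavior for the second issue can be sketched as a toy model. This is not linkmgrd code: `DefaultRouteState`, `ProbeSenderModel`, and `checkNaThenOkResumesProbing` are hypothetical names, and the model assumes only what the analysis states, namely that a default route state of na should stop probe sending and a later ok should resume it.

```cpp
#include <cassert>

// Toy model only; these names are hypothetical and are not linkmgrd classes.
enum class DefaultRouteState { NA, OK };

class ProbeSenderModel
{
public:
    // Apply one default route event: stop sending on NA, resume on OK.
    void handleDefaultRouteState(DefaultRouteState state)
    {
        mSendingProbes = (state == DefaultRouteState::OK);
    }

    bool isSendingProbes() const
    {
        return mSendingProbes;
    }

private:
    // Assumption: probing is enabled until an event says otherwise.
    bool mSendingProbes = true;
};

// Replays the event order seen in the syslog above (na first, then ok)
// and checks that probing ends up enabled again.
void checkNaThenOkResumesProbing()
{
    ProbeSenderModel model;

    model.handleDefaultRouteState(DefaultRouteState::NA);
    assert(!model.isSendingProbes());

    model.handleDefaultRouteState(DefaultRouteState::OK);
    assert(model.isSendingProbes());
}
```

In the reported logs, the real state machine behaves as if the second event never re-enabled sending; a regression test along these lines would replay the na-then-ok order and assert the final state.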

Flaky unit test

2021-11-26T09:15:52.5608440Z [ RUN ] LinkManagerStateMachineTest.MuxStandbyLinkProberUnknownCliSwitchover
2021-11-26T09:15:52.5609383Z test/LinkManagerStateMachineTest.cpp:598: Failure
2021-11-26T09:15:52.5609926Z Expected equality of these values:
2021-11-26T09:15:52.5610425Z std::get<1>(mTestCompositeState)
2021-11-26T09:15:52.5610868Z Which is: 4
2021-11-26T09:15:52.5611504Z mux_state::MuxState::Label::Active
2021-11-26T09:15:52.5611958Z Which is: 0
2021-11-26T09:15:52.5612431Z test/LinkManagerStateMachineTest.cpp:602: Failure
2021-11-26T09:15:52.5612968Z Expected equality of these values:
2021-11-26T09:15:52.5613467Z std::get<1>(mTestCompositeState)
2021-11-26T09:15:52.5613866Z Which is: 0
2021-11-26T09:15:52.5614325Z mux_state::MuxState::Label::Wait
2021-11-26T09:15:52.5614774Z Which is: 4
2021-11-26T09:15:52.5615273Z [ FAILED ] LinkManagerStateMachineTest.MuxStandbyLinkProberUnknownCliSwitchover (627 ms)
2021-11-26T09:15:52.5615967Z [ RUN ] LinkManagerStateMachineTest.MuxStandbyLinkProberUnknownReturnStandby
2021-11-26T09:15:52.5616653Z [ OK ] LinkManagerStateMachineTest.MuxStandbyLinkProberUnknownReturnStandby (663 ms)
2021-11-26T09:15:52.5617260Z [ RUN ] LinkManagerStateMachineTest.MuxActiveAsymetricLinkDrop
2021-11-26T09:15:52.5617886Z [ OK ] LinkManagerStateMachineTest.MuxActiveAsymetricLinkDrop (663 ms)
2021-11-26T09:15:52.5618519Z [ RUN ] LinkManagerStateMachineTest.MuxStandbyAsymetricLinkDrop
2021-11-26T09:15:52.5619115Z [ OK ] LinkManagerStateMachineTest.MuxStandbyAsymetricLinkDrop (511 ms)
2021-11-26T09:15:52.5620128Z [ RUN ] LinkManagerStateMachineTest.ActiveStateToProberUnknownMuxUnknownLinkUp
2021-11-26T09:15:52.5620815Z [ OK ] LinkManagerStateMachineTest.ActiveStateToProberUnknownMuxUnknownLinkUp (725 ms)
2021-11-26T09:15:52.5621481Z [ RUN ] LinkManagerStateMachineTest.StandbyStateToProberUnknownMuxUnknownLinkUp
2021-11-26T09:15:52.5622172Z [ OK ] LinkManagerStateMachineTest.StandbyStateToProberUnknownMuxUnknownLinkUp (1061 ms)
2021-11-26T09:15:52.5622844Z [ RUN ] LinkManagerStateMachineTest.ProberUnknownMuxUnknownLinkDown
2021-11-26T09:15:52.5623465Z [ OK ] LinkManagerStateMachineTest.ProberUnknownMuxUnknownLinkDown (431 ms)
2021-11-26T09:15:52.5624109Z [ RUN ] LinkManagerStateMachineTest.ProberWaitMuxUnknownLinkDown
2021-11-26T09:15:52.5624750Z [ OK ] LinkManagerStateMachineTest.ProberWaitMuxUnknownLinkDown (1061 ms)
2021-11-26T09:15:52.5625386Z [ RUN ] LinkManagerStateMachineTest.MuxActive2Error2Active
2021-11-26T09:15:52.5625987Z [ OK ] LinkManagerStateMachineTest.MuxActive2Error2Active (210 ms)
2021-11-26T09:15:52.5626593Z [ RUN ] LinkManagerStateMachineTest.MuxActive2ErrorStandby
2021-11-26T09:15:52.5627163Z [ OK ] LinkManagerStateMachineTest.MuxActive2ErrorStandby (168 ms)
2021-11-26T09:15:52.5627770Z [ RUN ] LinkManagerStateMachineTest.MuxStandby2Error2Standby
2021-11-26T09:15:52.5628387Z [ OK ] LinkManagerStateMachineTest.MuxStandby2Error2Standby (109 ms)
2021-11-26T09:15:52.5628970Z [ RUN ] LinkManagerStateMachineTest.MuxStandby2ErrorActive
2021-11-26T09:15:52.5629759Z [ OK ] LinkManagerStateMachineTest.MuxStandby2ErrorActive (132 ms)
2021-11-26T09:15:52.5630363Z [ RUN ] LinkManagerStateMachineTest.MuxActive2Error2Unknown
2021-11-26T09:15:52.5630954Z [ OK ] LinkManagerStateMachineTest.MuxActive2Error2Unknown (220 ms)
2021-11-26T09:15:52.5632080Z [ RUN ] LinkManagerStateMachineTest.MuxStandby2Error2Unknown
2021-11-26T09:15:52.5632857Z [ OK ] LinkManagerStateMachineTest.MuxStandby2Error2Unknown (229 ms)
2021-11-26T09:15:52.5633427Z [ RUN ] LinkManagerStateMachineTest.MuxActive2Unknown2Error
2021-11-26T09:15:52.5634025Z [ OK ] LinkManagerStateMachineTest.MuxActive2Unknown2Error (126 ms)
2021-11-26T09:15:52.5642248Z [ RUN ] LinkManagerStateMachineTest.MuxStandby2Unknown2Error
2021-11-26T09:15:52.5642900Z [ OK ] LinkManagerStateMachineTest.MuxStandby2Unknown2Error (128 ms)
2021-11-26T09:15:52.5644059Z [----------] 32 tests from LinkManagerStateMachineTest (12901 ms total)
2021-11-26T09:15:52.5644358Z
2021-11-26T09:15:52.5645055Z [----------] 7 tests from LinkProberTest
2021-11-26T09:15:52.5645540Z [ RUN ] LinkProberTest.InitializeSendBuffer
2021-11-26T09:15:52.5646078Z [ OK ] LinkProberTest.InitializeSendBuffer (10 ms)
2021-11-26T09:15:52.5646597Z [ RUN ] LinkProberTest.CalculateChecksum
2021-11-26T09:15:52.5647100Z [ OK ] LinkProberTest.CalculateChecksum (15 ms)
2021-11-26T09:15:52.5647619Z [ RUN ] LinkProberTest.UpdateEthernetFrame
2021-11-26T09:15:52.5648129Z [ OK ] LinkProberTest.UpdateEthernetFrame (28 ms)
2021-11-26T09:15:52.5648607Z [ RUN ] LinkProberTest.UpdateSequenceNo
2021-11-26T09:15:52.5649334Z [ OK ] LinkProberTest.UpdateSequenceNo (12 ms)
2021-11-26T09:15:52.5649835Z [ RUN ] LinkProberTest.GenerateGuid
2021-11-26T09:15:52.5650447Z [ OK ] LinkProberTest.GenerateGuid (12 ms)
2021-11-26T09:15:52.5651489Z [ RUN ] LinkProberTest.UpdateToRMac
2021-11-26T09:15:52.5652286Z [ OK ] LinkProberTest.UpdateToRMac (26 ms)
2021-11-26T09:15:52.5652948Z [ RUN ] LinkProberTest.InitializeException
2021-11-26T09:15:52.5653600Z [ OK ] LinkProberTest.InitializeException (18 ms)
2021-11-26T09:15:52.5655061Z [----------] 7 tests from LinkProberTest (136 ms total)
2021-11-26T09:15:52.5655710Z
2021-11-26T09:15:52.5656393Z [----------] 8 tests from MuxManagerTest
2021-11-26T09:15:52.5656888Z [ RUN ] MuxManagerTest.AddPort
2021-11-26T09:15:52.5657841Z [ OK ] MuxManagerTest.AddPort (77 ms)
2021-11-26T09:15:52.5658470Z [ RUN ] MuxManagerTest.Loopback2Address
2021-11-26T09:15:52.5659149Z [ OK ] MuxManagerTest.Loopback2Address (120 ms)
2021-11-26T09:15:52.5659712Z [ RUN ] MuxManagerTest.Loopback2AddressException
2021-11-26T09:15:52.5660288Z [ OK ] MuxManagerTest.Loopback2AddressException (132 ms)
2021-11-26T09:15:52.5660824Z [ RUN ] MuxManagerTest.ToRMacAddress
2021-11-26T09:15:52.5661303Z [ OK ] MuxManagerTest.ToRMacAddress (190 ms)
2021-11-26T09:15:52.5661828Z [ RUN ] MuxManagerTest.ToRMacAddressException
2021-11-26T09:15:52.5662381Z [ OK ] MuxManagerTest.ToRMacAddressException (293 ms)
2021-11-26T09:15:52.5662892Z [ RUN ] MuxManagerTest.ServerMacAddress
2021-11-26T09:15:52.5663422Z [ OK ] MuxManagerTest.ServerMacAddress (158 ms)
2021-11-26T09:15:52.5664285Z [ RUN ] MuxManagerTest.ServerMacAddressException
2021-11-26T09:15:52.5664807Z [ OK ] MuxManagerTest.ServerMacAddressException (244 ms)
2021-11-26T09:15:52.5665333Z [ RUN ] MuxManagerTest.LinkmgrdConfig
2021-11-26T09:15:52.5665836Z [ OK ] MuxManagerTest.LinkmgrdConfig (287 ms)
2021-11-26T09:15:52.5666750Z [----------] 8 tests from MuxManagerTest (1510 ms total)
2021-11-26T09:15:52.5667024Z
2021-11-26T09:15:52.5667736Z [----------] 4 tests from MuxState/MuxResponseTest
2021-11-26T09:15:52.5668292Z [ RUN ] MuxState/MuxResponseTest.MuxResponse/0
2021-11-26T09:15:52.5668830Z [ OK ] MuxState/MuxResponseTest.MuxResponse/0 (112 ms)
2021-11-26T09:15:52.5669385Z [ RUN ] MuxState/MuxResponseTest.MuxResponse/1
2021-11-26T09:15:52.5669931Z [ OK ] MuxState/MuxResponseTest.MuxResponse/1 (132 ms)
2021-11-26T09:15:52.5670462Z [ RUN ] MuxState/MuxResponseTest.MuxResponse/2
2021-11-26T09:15:52.5671024Z [ OK ] MuxState/MuxResponseTest.MuxResponse/2 (122 ms)
2021-11-26T09:15:52.5672094Z [ RUN ] MuxState/MuxResponseTest.MuxResponse/3
2021-11-26T09:15:52.5672795Z [ OK ] MuxState/MuxResponseTest.MuxResponse/3 (118 ms)
2021-11-26T09:15:52.5673823Z [----------] 4 tests from MuxState/MuxResponseTest (501 ms total)
2021-11-26T09:15:52.5674134Z
2021-11-26T09:15:52.5674878Z [----------] 4 tests from MuxState/GetMuxStateTest
2021-11-26T09:15:52.5675602Z [ RUN ] MuxState/GetMuxStateTest.GetMuxState/0
2021-11-26T09:15:52.5676126Z [ OK ] MuxState/GetMuxStateTest.GetMuxState/0 (53 ms)
2021-11-26T09:15:52.5676668Z [ RUN ] MuxState/GetMuxStateTest.GetMuxState/1
2021-11-26T09:15:52.5677570Z [ OK ] MuxState/GetMuxStateTest.GetMuxState/1 (69 ms)
2021-11-26T09:15:52.5678095Z [ RUN ] MuxState/GetMuxStateTest.GetMuxState/2
2021-11-26T09:15:52.5678646Z [ OK ] MuxState/GetMuxStateTest.GetMuxState/2 (82 ms)
2021-11-26T09:15:52.5679189Z [ RUN ] MuxState/GetMuxStateTest.GetMuxState/3
2021-11-26T09:15:52.5679732Z [ OK ] MuxState/GetMuxStateTest.GetMuxState/3 (76 ms)
2021-11-26T09:15:52.5680898Z [----------] 4 tests from MuxState/GetMuxStateTest (294 ms total)
2021-11-26T09:15:52.5681208Z
2021-11-26T09:15:52.5681948Z [----------] 3 tests from AutoActiveManual/MuxConfigUpdateTest
2021-11-26T09:15:52.5682516Z [ RUN ] AutoActiveManual/MuxConfigUpdateTest.MuxPortConfigUpdate/0
2021-11-26T09:15:52.5683168Z [ OK ] AutoActiveManual/MuxConfigUpdateTest.MuxPortConfigUpdate/0 (161 ms)
2021-11-26T09:15:52.5683799Z [ RUN ] AutoActiveManual/MuxConfigUpdateTest.MuxPortConfigUpdate/1
2021-11-26T09:15:52.5684425Z [ OK ] AutoActiveManual/MuxConfigUpdateTest.MuxPortConfigUpdate/1 (183 ms)
2021-11-26T09:15:52.5685056Z [ RUN ] AutoActiveManual/MuxConfigUpdateTest.MuxPortConfigUpdate/2
2021-11-26T09:15:52.5685694Z [ OK ] AutoActiveManual/MuxConfigUpdateTest.MuxPortConfigUpdate/2 (253 ms)
2021-11-26T09:15:52.5686694Z [----------] 3 tests from AutoActiveManual/MuxConfigUpdateTest (597 ms total)
2021-11-26T09:15:52.5687023Z
2021-11-26T09:15:52.5687711Z [----------] Global test environment tear-down
2021-11-26T09:15:52.5688261Z [==========] 58 tests from 6 test cases ran. (15950 ms total)
2021-11-26T09:15:52.5689240Z [ PASSED ] 57 tests.
2021-11-26T09:15:52.5689660Z [ FAILED ] 1 test, listed below:
2021-11-26T09:15:52.5690238Z [ FAILED ] LinkManagerStateMachineTest.MuxStandbyLinkProberUnknownCliSwitchover
2021-11-26T09:15:52.5690537Z
2021-11-26T09:15:52.5690891Z 1 FAILED TEST
2021-11-26T09:15:52.5691703Z make[3]: *** [Makefile:84: test-targets] Error 1
2021-11-26T09:15:52.5693801Z make[3]: Leaving directory '/sonic/src/linkmgrd'
2021-11-26T09:15:52.5694511Z make[2]: *** [Makefile:91: test] Error 2
2021-11-26T09:15:52.5695518Z make[2]: Leaving directory '/sonic/src/linkmgrd'
2021-11-26T09:15:52.5696839Z dh_auto_test: error: make -j4 test returned exit code 2
2021-11-26T09:15:52.5697758Z make[1]: *** [debian/rules:4: build] Error 25
2021-11-26T09:15:52.5698716Z make[1]: Leaving directory '/sonic/src/linkmgrd'
2021-11-26T09:15:52.5699636Z dpkg-buildpackage: error: debian/rules build subprocess returned exit status 2
2021-11-26T09:15:52.5701105Z [ FAIL LOG END ] [ target/debs/buster/sonic-linkmgrd_1.0.0-1_amd64.deb ]
2021-11-26T09:15:53.5535344Z make: *** [slave.mk:497: target/debs/buster/sonic-linkmgrd_1.0.0-1_amd64.deb] Error 1
2021-11-26T09:15:53.5536126Z make: *** Waiting for unfinished jobs....
2021-11-26T09:16:02.0694334Z [ finished ] [ target/python-wheels/sonic_yang_models-1.0-py3-none-any.whl ]
2021-11-26T09:17:03.2039119Z [ finished ] [ target/debs/buster/sonic-mgmt-common_1.0.0_amd64.deb ]
2021-11-26T09:19:11.8750826Z [ finished ] [ target/debs/buster/libsairedis_1.0.0_amd64.deb ]
2021-11-26T09:19:12.6511684Z DEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
2021-11-26T09:19:38.5213841Z make[1]: *** [Makefile.work:304: buster] Error 2
2021-11-26T09:19:38.5215450Z make[1]: Leaving directory '/agent/_work/1/s'
2021-11-26T09:19:38.5220201Z make: *** [Makefile:32: target/sonic-mellanox.bin] Error 2
2021-11-26T09:19:38.5586660Z ##[error]Bash exited with code '2'.
2021-11-26T09:19:38.5633420Z ##[section]Finishing: Build sonic image

[active-active] mux config being ignored if first attempt of toggle fails

Issue

The issue was triggered by unexpected gRPC behavior: set requests didn't go through, but read requests were still answered.

Sep 20 19:26:05.146013 XXXXXXXXXXXXXXXXXXXX NOTICE pmon#ycable[323140]: calling RPC for hw mux_cable set state ispeer = False port Ethernet116 portid 1 read_side 1 state requested standby
Sep 20 19:26:05.149206 XXXXXXXXXXXXXXXXXXXX NOTICE pmon#ycable[323140]: response was none hw_mux_cable_table_grpc_notification Ethernet116
Sep 20 19:26:05.152855 XXXXXXXXXXXXXXXXXXXX NOTICE pmon#ycable[323140]: calling RPC for getting forwarding state port = Ethernet116 portid 1 peer portid 0 read_side 1
Sep 20 19:26:05.156604 XXXXXXXXXXXXXXXXXXXX NOTICE pmon#ycable[323140]: forwarding state RPC received response port = Ethernet116 portids [0, 1] read_side 1
Sep 20 19:26:05.156604 XXXXXXXXXXXXXXXXXXXX NOTICE pmon#ycable[323140]: forwarding state RPC received response port = Ethernet116 state values = [False, True] read_side 1

But in this case, if we switch to standby from CLI, linkmgrd won't trigger a second attempt after receiving forwarding state probe response, due to code block below:

void ActiveActiveStateMachine::LinkProberActiveMuxActiveLinkUpTransitionFunction(CompositeState &nextState)
{
    MUXLOGINFO(mMuxPortConfig.getPortName());
    if (ms(mCompositeState) == mux_state::MuxState::Unknown) {
        switchMuxState(nextState, mux_state::MuxState::Label::Active);
    }
}

Later toggles to active will also be ignored, because the state machine believes it is already in a healthy active state. Thus, the DB won't be updated, and show mux status will return standby, unknown.
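
The effect of the guard quoted above can be illustrated with a small standalone model. This is only a sketch, not linkmgrd code: `MuxState`, `MuxMode`, `needsToggleCurrent`, and `needsToggleWithConfig` are hypothetical names, and comparing the probed state against the configured mode is an assumed direction for a fix, not a confirmed one.

```cpp
// Hypothetical, simplified stand-ins for linkmgrd's state and config types.
enum class MuxState { Active, Standby, Unknown };
enum class MuxMode { Auto, Active, Standby };

// Mirrors the guard quoted above: a toggle is requested only when the
// mux state is Unknown.
bool needsToggleCurrent(MuxState mux)
{
    return mux == MuxState::Unknown;
}

// Assumed alternative: also request a toggle when the probed mux state
// contradicts a manually configured mode.
bool needsToggleWithConfig(MuxState mux, MuxMode mode)
{
    if (mux == MuxState::Unknown) {
        return true;
    }
    if (mode == MuxMode::Standby && mux != MuxState::Standby) {
        return true;
    }
    if (mode == MuxMode::Active && mux != MuxState::Active) {
        return true;
    }
    return false;
}
```

In the scenario from the logs (config standby, probe response active), `needsToggleCurrent(MuxState::Active)` is false, so no second attempt happens, while the assumed variant would request one.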

unstable linkmgrd test issue


[ RUN      ] LinkManagerStateMachineTest.MuxActive2Error2Active
[       OK ] LinkManagerStateMachineTest.MuxActive2Error2Active (374 ms)
[ RUN      ] LinkManagerStateMachineTest.MuxActive2ErrorStandby
test/LinkManagerStateMachineTest.cpp:846: Failure
Expected equality of these values:
  std::get<1>(mTestCompositeState)
    Which is: 4
  mux_state::MuxState::Label::Standby
    Which is: 1
[  FAILED  ] LinkManagerStateMachineTest.MuxActive2ErrorStandby (648 ms)
[ RUN      ] LinkManagerStateMachineTest.MuxStandby2Error2Standby
[       OK ] LinkManagerStateMachineTest.MuxStandby2Error2Standby (799 ms)
[ RUN      ] LinkManagerStateMachineTest.MuxStandby2ErrorActive
test/LinkManagerStateMachineTest.cpp:880: Failure
Expected equality of these values:
  std::get<1>(mTestCompositeState)
    Which is: 4
  mux_state::MuxState::Label::Active
    Which is: 0
[  FAILED  ] LinkManagerStateMachineTest.MuxStandby2ErrorActive (565 ms)
[ RUN      ] LinkManagerStateMachineTest.MuxActive2Error2Unknown
[       OK ] LinkManagerStateMachineTest.MuxActive2Error2Unknown (699 ms)
[ RUN      ] LinkManagerStateMachineTest.MuxStandby2Error2Unknown
[       OK ] LinkManagerStateMachineTest.MuxStandby2Error2Unknown (521 ms)
[ RUN      ] LinkManagerStateMachineTest.MuxActive2Unknown2Error
[       OK ] LinkManagerStateMachineTest.MuxActive2Unknown2Error (311 ms)
[ RUN      ] LinkManagerStateMachineTest.MuxStandby2Unknown2Error
[       OK ] LinkManagerStateMachineTest.MuxStandby2Unknown2Error (352 ms)
[----------] 31 tests from LinkManagerStateMachineTest (13839 ms total)

[----------] 7 tests from LinkProberTest
[ RUN      ] LinkProberTest.InitializeSendBuffer
[       OK ] LinkProberTest.InitializeSendBuffer (11 ms)
[ RUN      ] LinkProberTest.CalculateChecksum
[       OK ] LinkProberTest.CalculateChecksum (32 ms)
[ RUN      ] LinkProberTest.UpdateEthernetFrame
[       OK ] LinkProberTest.UpdateEthernetFrame (21 ms)
[ RUN      ] LinkProberTest.UpdateSequenceNo
[       OK ] LinkProberTest.UpdateSequenceNo (32 ms)
[ RUN      ] LinkProberTest.GenerateGuid
[       OK ] LinkProberTest.GenerateGuid (33 ms)
[ RUN      ] LinkProberTest.UpdateToRMac
[       OK ] LinkProberTest.UpdateToRMac (44 ms)
[ RUN      ] LinkProberTest.InitializeException
[       OK ] LinkProberTest.InitializeException (34 ms)
[----------] 7 tests from LinkProberTest (220 ms total)

[----------] 8 tests from MuxManagerTest
[ RUN      ] MuxManagerTest.AddPort
[       OK ] MuxManagerTest.AddPort (177 ms)
[ RUN      ] MuxManagerTest.Loopback2Address
[       OK ] MuxManagerTest.Loopback2Address (100 ms)
[ RUN      ] MuxManagerTest.Loopback2AddressException
[       OK ] MuxManagerTest.Loopback2AddressException (101 ms)
[ RUN      ] MuxManagerTest.ToRMacAddress
[       OK ] MuxManagerTest.ToRMacAddress (180 ms)
[ RUN      ] MuxManagerTest.ToRMacAddressException
[       OK ] MuxManagerTest.ToRMacAddressException (51 ms)
[ RUN      ] MuxManagerTest.ServerMacAddress
[       OK ] MuxManagerTest.ServerMacAddress (115 ms)
[ RUN      ] MuxManagerTest.ServerMacAddressException
[       OK ] MuxManagerTest.ServerMacAddressException (134 ms)
[ RUN      ] MuxManagerTest.LinkmgrdConfig
[       OK ] MuxManagerTest.LinkmgrdConfig (136 ms)
[----------] 8 tests from MuxManagerTest (1003 ms total)

[----------] 4 tests from MuxState/MuxResponseTest
[ RUN      ] MuxState/MuxResponseTest.MuxResponse/0
[       OK ] MuxState/MuxResponseTest.MuxResponse/0 (58 ms)
[ RUN      ] MuxState/MuxResponseTest.MuxResponse/1
[       OK ] MuxState/MuxResponseTest.MuxResponse/1 (66 ms)
[ RUN      ] MuxState/MuxResponseTest.MuxResponse/2
[       OK ] MuxState/MuxResponseTest.MuxResponse/2 (27 ms)
[ RUN      ] MuxState/MuxResponseTest.MuxResponse/3
[       OK ] MuxState/MuxResponseTest.MuxResponse/3 (31 ms)
[----------] 4 tests from MuxState/MuxResponseTest (210 ms total)

[----------] 4 tests from MuxState/GetMuxStateTest
[ RUN      ] MuxState/GetMuxStateTest.GetMuxState/0
[       OK ] MuxState/GetMuxStateTest.GetMuxState/0 (27 ms)
[ RUN      ] MuxState/GetMuxStateTest.GetMuxState/1
[       OK ] MuxState/GetMuxStateTest.GetMuxState/1 (30 ms)
[ RUN      ] MuxState/GetMuxStateTest.GetMuxState/2
[       OK ] MuxState/GetMuxStateTest.GetMuxState/2 (43 ms)
[ RUN      ] MuxState/GetMuxStateTest.GetMuxState/3
[       OK ] MuxState/GetMuxStateTest.GetMuxState/3 (56 ms)
[----------] 4 tests from MuxState/GetMuxStateTest (169 ms total)

[----------] 3 tests from AutoActiveManual/MuxConfigUpdateTest
[ RUN      ] AutoActiveManual/MuxConfigUpdateTest.MuxPortConfigUpdate/0
[       OK ] AutoActiveManual/MuxConfigUpdateTest.MuxPortConfigUpdate/0 (43 ms)
[ RUN      ] AutoActiveManual/MuxConfigUpdateTest.MuxPortConfigUpdate/1
[       OK ] AutoActiveManual/MuxConfigUpdateTest.MuxPortConfigUpdate/1 (29 ms)
[ RUN      ] AutoActiveManual/MuxConfigUpdateTest.MuxPortConfigUpdate/2
[       OK ] AutoActiveManual/MuxConfigUpdateTest.MuxPortConfigUpdate/2 (20 ms)
[----------] 3 tests from AutoActiveManual/MuxConfigUpdateTest (119 ms total)

[----------] Global test environment tear-down
[==========] 57 tests from 6 test cases ran. (15564 ms total)
[  PASSED  ] 55 tests.
[  FAILED  ] 2 tests, listed below:
[  FAILED  ] LinkManagerStateMachineTest.MuxActive2ErrorStandby
[  FAILED  ] LinkManagerStateMachineTest.MuxStandby2ErrorActive

2 FAILED TESTS
make[3]: *** [Makefile:84: test-targets] Error 1
make[3]: Leaving directory '/sonic/src/linkmgrd'
make[2]: *** [Makefile:91: test] Error 2
make[2]: Leaving directory '/sonic/src/linkmgrd'
dh_auto_test: make -j4 test returned exit code 2
make[1]: *** [debian/rules:4: build] Error 2
make[1]: Leaving directory '/sonic/src/linkmgrd'
dpkg-buildpackage: error: debian/rules build subprocess returned exit status 2

[active-active] Port being in `SERVER_STATUS` unknown after config reload

what is the issue?

After config reload, SERVER_STATUS remains unknown for some ports.

# show mux s
PORT        STATUS    SERVER_STATUS    HEALTH    HWSTATUS      LAST_SWITCHOVER_TIME
----------  --------  ---------------  --------  ------------  ---------------------------
Ethernet4   active    unknown          healthy   inconsistent  2022-Oct-13 07:05:06.267624
Ethernet8   active    unknown          healthy   inconsistent  2022-Oct-13 07:05:06.267750
Ethernet12  active    active           healthy   consistent    2022-Oct-13 07:07:17.066053
Ethernet16  active    unknown          healthy   inconsistent  2022-Oct-13 07:05:06.267521
Ethernet20  active    unknown          healthy   inconsistent  2022-Oct-13 07:05:06.267112
Ethernet24  active    unknown          healthy   inconsistent  2022-Oct-13 07:05:06.267334
Ethernet28  active    unknown          healthy   inconsistent  2022-Oct-13 07:05:06.268106
Ethernet32  active    active           healthy   consistent    2022-Oct-13 07:07:17.086779
Ethernet36  active    unknown          healthy   inconsistent  2022-Oct-13 07:05:06.244156
Ethernet40  active    active           healthy   consistent    2022-Oct-13 07:07:17.089623
Ethernet44  active    unknown          healthy   inconsistent  2022-Oct-13 07:05:06.266907
Ethernet48  active    unknown          healthy   inconsistent  2022-Oct-13 07:05:06.267225

analysis

The syslog for Ethernet4:

Oct 13 07:07:06.280407 WARNING write_standby: Applying state to interfaces {'Ethernet12': 'active', 'Ethernet16': 'active', 'Ethernet20': 'active', 'Ethernet24': 'active', 'Ethernet28': 'active', 'Ethernet32': 'active', 'Ethernet36': 'active', 'Ethernet4': 'active', 'Ethernet40': 'active', 'Ethernet44': 'active', 'Ethernet48': 'active', 'Ethernet52': 'standby', 'Ethernet56': 'standby', 'Ethernet60': 'standby', 'Ethernet64': 'standby', 'Ethernet68': 'standby', 'Ethernet72': 'standby', 'Ethernet76': 'standby', 'Ethernet8': 'active', 'Ethernet80': 'standby', 'Ethernet84': 'standby', 'Ethernet88': 'standby', 'Ethernet92': 'standby', 'Ethernet96': 'standby'}
Oct 13 07:07:10.093135 NOTICE swss#orchagent: :- setState: [Ethernet4] Set MUX state from standby to active
Oct 13 07:07:10.097718 NOTICE swss#orchagent: :- addOperation: Mux State set to active for port Ethernet4
Oct 13 07:07:10.113299 NOTICE pmon#ycable[28]: calling RPC for hw mux_cable set state ispeer = False port Ethernet4 portid 1 read_side 1 state requested active
Oct 13 07:07:10.113299 NOTICE pmon#ycable[28]: response was none hw_mux_cable_table_grpc_notification Ethernet4
Oct 13 07:07:10.113299 WARNING pmon#ycable[28]: ERR: Got a change event for updating gRPC but could not toggle the mux-direction for port Ethernet4 state from unknown to active, writing unknown
Oct 13 07:07:10.312805 NOTICE swss#orchagent: :- addOperation: Mux setting State DB entry (hw state unknown, mux state unknown) for port Ethernet4
Oct 13 07:07:14.759053 WARNING mux#linkmgrd: MuxManager.cpp:222 updatePortCableType: Ethernet4: Port cable type: active-active
Oct 13 07:07:14.764550 WARNING mux#linkmgrd: MuxManager.cpp:162 addOrUpdateMuxPort: Ethernet4: server IP: 192.168.0.2
Oct 13 07:07:14.793861 WARNING mux#linkmgrd: MuxManager.cpp:185 addOrUpdateMuxPortSoCAddress: Ethernet4: SoC IP: 192.168.0.3
Oct 13 07:07:14.797145 WARNING mux#linkmgrd: MuxPort.cpp:434 handleTsaEnable: port: Ethernet4, configuring mux mode due to tsa_enable notification from CONFIG DB.
Oct 13 07:07:14.799921 WARNING mux#linkmgrd: MuxManager.cpp:207 updateMuxPortConfig: Ethernet4: Mux port config: auto
Oct 13 07:07:14.806406 WARNING mux#linkmgrd: MuxManager.cpp:262 addOrUpdateMuxPortLinkState: Ethernet4: link state: up
Oct 13 07:07:14.811004 WARNING mux#linkmgrd: MuxManager.cpp:288 addOrUpdateMuxPortMuxState: Ethernet4: state db mux state: unknown
Oct 13 07:07:14.858914 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:170 handleMuxStateNotification: Ethernet4: state db mux state: Unknown
Oct 13 07:07:14.858914 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:185 handleMuxStateNotification: Ethernet4: ycabled reports MUX state as 'Unknown' during init. phase! Is there a functioning gRPC server?
Oct 13 07:07:14.858960 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:76 activateStateMachine: Ethernet4: MUX port link prober initialized with server IP: 192.168.0.3, server MAC: 04:27:28:7a:00:04
Oct 13 07:07:14.859056 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:85 activateStateMachine: Ethernet4: (P: Wait, M: Wait, L: Up) -> (P: Wait, M: Wait, L: Up)
Oct 13 07:07:14.878965 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:789 setLabel: Ethernet4: Linkmgrd state is: Wait Unhealthy
Oct 13 07:07:15.235006 NOTICE pmon#ycable[28]: calling RPC for getting forwarding state port = Ethernet4 portid 1 peer portid 0 read_side 1
Oct 13 07:07:15.250583 NOTICE pmon#ycable[28]: forwarding state RPC received response port = Ethernet4 portids [0, 1] read_side 1
Oct 13 07:07:15.250583 NOTICE pmon#ycable[28]: forwarding state RPC received response port = Ethernet4 state values = [True, True] read_side 1
Oct 13 07:07:15.267701 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:257 handleProbeMuxStateNotification: Ethernet4: app db mux state: Active
Oct 13 07:07:15.267701 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:463 handleStateChange: Ethernet4: Received mux state event, new state: Active
Oct 13 07:07:15.494349 NOTICE pmon#ycable[28]: calling RPC for getting forwarding state port = Ethernet4 portid 1 peer portid 0 read_side 1
Oct 13 07:07:15.501449 NOTICE pmon#ycable[28]: forwarding state RPC received response port = Ethernet4 portids [0, 1] read_side 1
Oct 13 07:07:15.501449 NOTICE pmon#ycable[28]: forwarding state RPC received response port = Ethernet4 state values = [True, True] read_side 1
Oct 13 07:07:15.522767 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:257 handleProbeMuxStateNotification: Ethernet4: app db mux state: Active
Oct 13 07:07:15.881503 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:425 handleStateChange: Ethernet4: Received link prober event, new state: Active
Oct 13 07:07:15.881860 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:445 handleStateChange: Ethernet4: (P: Wait, M: Active, L: Up) -> (P: Active, M: Active, L: Up)
Oct 13 07:07:15.882010 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:789 setLabel: Ethernet4: Linkmgrd state is: Active Healthy
  • after linkmgrd boots up:
(wait, wait, up) --> mux notification unknown(`write_standby.py` toggling active fails), tries to mux probe -->
(wait, wait, up) --> link probe active, activate state machine, send heartbeats -->
(wait, active, up) --> receive heartbeats -->
(active, active, up)

But linkmgrd requests no further toggle, so MUX_CABLE_TABLE|Ethernet4 in STATE_DB still has its state as unknown, which was the response to the initial toggle from write_standby.py.

`make test` complains warnings about profile count data not found

New warnings after running make test:

src/common/State.cpp:44:1: warning: '/sonic/src/devel/fix_warnings/sonic-linkmgrd/src/common/State.gcda' profile count data file not found [-Wmissing-profile]
   44 | } /* namespace common */
      | ^
src/link_state/UpState.cpp: In function '(static initializers for src/link_state/UpState.cpp)':
src/link_state/UpState.cpp:95:1: warning: '/sonic/src/devel/fix_warnings/sonic-linkmgrd/src/link_state/UpState.gcda' profile count data file not found [-Wmissing-profile]
   95 | } /* namespace link_state */
      | ^
Finished building: src/common/State.cpp

src/link_prober/StandbyState.cpp: In function '(static initializers for src/link_prober/StandbyState.cpp)':
src/link_prober/StandbyState.cpp:124:1: warning: '/sonic/src/devel/fix_warnings/sonic-linkmgrd/src/link_prober/StandbyState.gcda' profile count data file not found [-Wmissing-profile]
  124 | } /* namespace link_prober */
      | ^
src/mux_state/MuxState.cpp: In function '(static initializers for src/mux_state/MuxState.cpp)':
src/mux_state/MuxState.cpp:46:1: warning: '/sonic/src/devel/fix_warnings/sonic-linkmgrd/src/mux_state/MuxState.gcda' profile count data file not found [-Wmissing-profile]
   46 | } /* namespace mux_state */
      | ^
Finished building: src/link_state/UpState.cpp

[active-standby] Unexpected toggle due to the oscillation logic

There is an unexpected toggle observed due to the oscillation.
The background is that the oscillation timer is started when the mux is in (wait, active, up); when it expires, if the mux is still in (wait, active, up), linkmgrd toggles to standby.
But (wait, active, up) is also an intermediate state on the toggle path from standby to active, so the oscillation timer can be started by a legitimate toggle from standby to active. If the timer then expires while the mux is toggling from standby to active and is in (wait, active, up), the mux toggles back to standby. This extra toggle caused by the oscillation logic is disruptive.
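
The expiry check described above can be modeled in a few lines. This is a sketch under stated assumptions: the enums, `CompositeState`, and both functions are hypothetical simplifications, and the `toggleToActiveInFlight` guard is only one conceivable mitigation, not the fix adopted by the project.

```cpp
// Hypothetical, simplified model of the composite state (prober, mux, link).
enum class Prober { Active, Standby, Unknown, Wait };
enum class Mux { Active, Standby, Unknown, Wait };
enum class Link { Up, Down };

struct CompositeState {
    Prober prober;
    Mux mux;
    Link link;
};

// Behavior as described above: on timer expiry, toggle to standby if the
// port is still in (wait, active, up).
bool toggleOnExpiryCurrent(const CompositeState &s)
{
    return s.prober == Prober::Wait && s.mux == Mux::Active && s.link == Link::Up;
}

// Assumed mitigation: ignore the expiry while a standby-to-active toggle
// is still in flight.
bool toggleOnExpiryGuarded(const CompositeState &s, bool toggleToActiveInFlight)
{
    if (toggleToActiveInFlight) {
        return false;
    }
    return toggleOnExpiryCurrent(s);
}
```

With the current check, a port that passes through (wait, active, up) during a legitimate toggle is indistinguishable from an oscillating one, which is why the extra flag (or some equivalent signal) would be needed.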

The whole story goes like the following:
image

[active-active] config to "auto" didn't trigger a toggle

Configuring a healthy port from standby back to auto didn't trigger a toggle to active.

:~$ show mux status Ethernet72
PORT        STATUS    SERVER_STATUS    HEALTH    HWSTATUS    LAST_SWITCHOVER_TIME
----------  --------  ---------------  --------  ----------  ----------------------
Ethernet72  standby   standby          healthy   consistent
Aug 16 20:44:23.568245 xxxxxxxxxxxxxxxxxxxx NOTICE swss#orchagent: :- handleMuxCfg: Ethernet72: 10.50.139.11 was added to ignored neighbor list
Aug 16 20:44:23.568701 xxxxxxxxxxxxxxxxxxxx WARNING mux#linkmgrd: MuxManager.cpp:207 updateMuxPortConfig: Ethernet72: Mux port config: auto
Aug 16 20:44:23.568923 xxxxxxxxxxxxxxxxxxxx WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:225 handleMuxConfigNotification: Ethernet72: (P: Active, M: Active, L: Up) -> (P: Active, M: Active, L: Up)
Aug 16 20:44:23.570029 xxxxxxxxxxxxxxxxxxxx NOTICE pmon#ycable[29]: calling RPC for getting forwarding state port = Ethernet72 portid 1 peer portid 0 read_side 1
Aug 16 20:44:23.571741 xxxxxxxxxxxxxxxxxxxx NOTICE pmon#ycable[29]: forwarding state RPC received response port = Ethernet72 portids [1, 0] read_side 1
Aug 16 20:44:23.571741 xxxxxxxxxxxxxxxxxxxx NOTICE pmon#ycable[29]: forwarding state RPC received response port = Ethernet72 state values = [False, True] read_side 1
Aug 16 20:44:23.572280 xxxxxxxxxxxxxxxxxxxx WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:241 handleProbeMuxStateNotification: Ethernet72: app db mux state: Active

[master] linkmgrd terminated after throwing an instance of 'std::out_of_range'

Issue

Mar 14 19:30:56.498083 str2-7050cx3-acs-08 WARNING mux#linkmgrd: MuxManager.cpp:208 addOrUpdateMuxPortLinkState: Ethernet106: link state: down 
Mar 14 19:30:56.498119 str2-7050cx3-acs-08 DEBUG mux#linkmgrd: link_state/LinkStateMachine.cpp:70 enterState: Ethernet4
Mar 14 19:30:56.498119 str2-7050cx3-acs-08 DEBUG mux#linkmgrd: link_state/UpState.cpp:91 resetState: Ethernet4
Mar 14 19:30:56.498141 str2-7050cx3-acs-08 INFO mux#/supervisord: linkmgrd terminate called after throwing an instance of 'std::out_of_range'
Mar 14 19:30:56.498154 str2-7050cx3-acs-08 INFO mux#/supervisord: linkmgrd   what():  map::at
Mar 14 19:30:56.784193 str2-7050cx3-acs-08 INFO mux#supervisord 2022-03-14 19:30:56,404 INFO success: linkmgrd entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
Mar 14 19:30:56.802350 str2-7050cx3-acs-08 INFO mux#supervisord 2022-03-14 19:30:56,801 INFO exited: linkmgrd (terminated by SIGABRT (core dumped); not expected)
Mar 14 19:30:56.815402 str2-7050cx3-acs-08 INFO mux#/supervisor-proc-exit-listener: Process 'linkmgrd' exited unexpectedly. Terminating supervisor 'mux'

I have a feeling the error came from the line below, but haven't dived deeper into it. I thought port type should have been loaded at the moment of receiving swss link state notification?
https://github.com/lolyu/sonic-linkmgrd/blob/c779b8fbbb4239aeec6810a3e32b1d01031454c7/src/MuxManager.cpp#L335
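
The `what(): map::at` message in the log is what `std::map::at` reports when the key is missing. The sketch below shows that behavior next to a `find`-based lookup that lets the caller handle an unknown port. `PortCableTypeMap`, `cableTypeOrThrow`, and `lookupCableType` are hypothetical names for illustration; the actual container and key used at the linked line are not confirmed here.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical port-name -> cable-type table; not linkmgrd's real data structure.
using PortCableTypeMap = std::map<std::string, std::string>;

// std::map::at throws std::out_of_range when the key is absent; if nothing
// catches it, std::terminate is called and the process aborts.
std::string cableTypeOrThrow(const PortCableTypeMap &table, const std::string &port)
{
    return table.at(port);
}

// find() lets the caller handle a missing key explicitly.
// Returns true and fills cableType only when the port is present.
bool lookupCableType(
    const PortCableTypeMap &table,
    const std::string &port,
    std::string &cableType
)
{
    auto it = table.find(port);
    if (it == table.end()) {
        return false;
    }
    cableType = it->second;
    return true;
}
```

If the port type really is loaded after the first link state notification arrives, a lookup of the second kind would let the notification be skipped or deferred instead of aborting the daemon.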

How to reproduce

Replace docker-mux image with the master branch image, run sudo systemctl start mux.

Flaky test DbInterfaceRaceConditionCheck

[ RUN ] MuxManagerTest.DbInterfaceRaceConditionCheck
test/MuxManagerTest.cpp:1015: Failure
Value of: mDbInterfacePtr->mDbInterfaceRaceConditionCheckFailure
Actual: true
Expected: false
test/MuxManagerTest.cpp:1016: Failure
Expected equality of these values:
mDbInterfacePtr->mSetMuxStateInvokeCount
Which is: 0
i+1
Which is: 1
[ FAILED ] MuxManagerTest.DbInterfaceRaceConditionCheck (10157 ms)

[active-active][202205] link operational down didn't trigger toggle

# show interface status Ethernet120
  Interface            Lanes    Speed    MTU    FEC         Alias    Vlan    Oper    Admin             Type    Asym PFC
-----------  ---------------  -------  -----  -----  ------------  ------  ------  -------  ---------------  ----------
Ethernet120  121,122,123,124     100G   9100   none  Ethernet31/1   trunk    down       up  QSFP28 or later         off


# show mux sta Ethernet120
PORT         STATUS    SERVER_STATUS    HEALTH     HWSTATUS    LAST_SWITCHOVER_TIME
-----------  --------  ---------------  ---------  ----------  ---------------------------
Ethernet120  active    active           unhealthy  consistent  2023-Feb-21 11:14:36.277277

The mux config is auto.

[active-active] Peer toggle issue

what is the issue

The flaky test case test_link_drop.py::test_active_link_drop_upstream fails, showing that there is a disruption of upstream traffic after simulating a link drop for upstream traffic from the mux server to the upper ToR.

analysis

The root cause is that once the link drops upstream traffic from the mux server to the upper ToR, the upper ToR receives no more heartbeats, neither from itself nor from its peer. The upper ToR will toggle itself, and it will also toggle the peer if the missing peer heartbeat is detected first (while the upper ToR is still healthy). In theory, those toggles should fail, since the upstream link is dropping packets.
But in fact, the toggle requests can reach the gRPC server (nic_simulator) and be processed by it. Hence the issue: if the missing peer heartbeat is detected first, linkmgrd requests toggling the peer and then requests toggling itself, and those two toggles make both ToRs unreachable from the mux server.
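
The ordering dependence can be reproduced with a tiny model. This is a sketch, assuming every toggle request succeeds (as it does against nic_simulator) and that the peer is toggled only while the local ToR is still healthy; `Event`, `Forwarding`, `CableState`, and `applyEvents` are hypothetical names, not linkmgrd types.

```cpp
#include <vector>

// Hypothetical link prober events seen by the upper ToR.
enum class Event { SelfHeartbeatLost, PeerHeartbeatLost };
enum class Forwarding { Active, Standby };

// Forwarding state of both sides of the cable, from the upper ToR's view.
struct CableState {
    Forwarding self;
    Forwarding peer;
};

// Applies the events in order and returns the resulting forwarding states.
CableState applyEvents(CableState cable, const std::vector<Event> &events)
{
    bool selfHealthy = true;
    for (Event e : events) {
        if (e == Event::SelfHeartbeatLost) {
            selfHealthy = false;
            cable.self = Forwarding::Standby;  // toggle itself
        } else if (selfHealthy) {
            cable.peer = Forwarding::Standby;  // toggle peer only while healthy
        }
    }
    return cable;
}
```

Starting from both sides active, the order {PeerHeartbeatLost, SelfHeartbeatLost} leaves both sides standby, while {SelfHeartbeatLost, PeerHeartbeatLost} leaves the peer active.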

the logs

logs showing toggling peer first, then itself
  • The syslog from upper ToR showing peer heartbeat missing is reported first:
Oct 12 10:46:10.591060 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:524 handlePeerStateChange: Ethernet8: Received peer link prober event, new state: PeerUnknown
Oct 12 10:46:10.591133 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:909 switchPeerMuxState: Ethernet8: Switching peer MUX state to 'Standby'
Oct 12 10:46:11.098198 NOTICE pmon#ycable[29]: calling RPC for hw mux_cable set state ispeer = True port Ethernet8 portid 0 read_side 1 state requested standby
Oct 12 10:46:11.590773 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:425 handleStateChange: Ethernet8: Received link prober event, new state: Unknown
Oct 12 10:46:11.590877 WARNING mux#linkmgrd: DbInterface.cpp:259 postLinkProberMetricsEvent: Ethernet8: posting link prober event link_prober_unknown_start
Oct 12 10:46:11.590933 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:881 switchMuxState: Ethernet8: Switching MUX state to 'Standby'
Oct 12 10:46:11.591185 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:445 handleStateChange: Ethernet8: (P: Active, M: Active, L: Up) -> (P: Unknown, M: Standby, L: Up)
Oct 12 10:46:11.591185 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:789 setLabel: Ethernet8: Linkmgrd state is: Standby Unhealthy
Oct 12 10:46:11.591556 WARNING mux#linkmgrd: DbInterface.cpp:511 handlePostLinkProberMetrics: Ethernet8: posting link prober event link_prober_unknown_start
Oct 12 10:46:11.593097 NOTICE swss#orchagent: :- setState: [Ethernet8] Set MUX state from active to standby
Oct 12 10:46:11.599741 NOTICE pmon#ycable[29]: response was none hw_mux_cable_table_grpc_notification Ethernet8
Oct 12 10:46:11.599880 WARNING pmon#ycable[29]: ERR: Got a change event for updating gRPC but could not toggle the mux-direction for port Ethernet8 state from unknown to standby, writing unknown
Oct 12 10:46:11.600643 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:323 handlePeerMuxStateNotification: Ethernet8: state db mux state: Unknown
Oct 12 10:46:11.603653 NOTICE swss#orchagent: :- addOperation: Mux State set to standby for port Ethernet8
Oct 12 10:46:12.105649 NOTICE pmon#ycable[29]: calling RPC for hw mux_cable set state ispeer = False port Ethernet8 portid 1 read_side 1 state requested standby
Oct 12 10:46:12.608223 NOTICE pmon#ycable[29]: response was none hw_mux_cable_table_grpc_notification Ethernet8
Oct 12 10:46:12.608644 WARNING pmon#ycable[29]: ERR: Got a change event for updating gRPC but could not toggle the mux-direction for port Ethernet8 state from active to standby, writing unknown
Oct 12 10:46:12.611272 NOTICE swss#orchagent: :- addOperation: Mux setting State DB entry (hw state unknown, mux state unknown) for port Ethernet8
Oct 12 10:46:12.611877 WARNING mux#linkmgrd: MuxManager.cpp:288 addOrUpdateMuxPortMuxState: Ethernet8: state db mux state: unknown
Oct 12 10:46:12.612204 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:170 handleMuxStateNotification: Ethernet8: state db mux state: Unknown
Oct 12 10:46:12.612246 WARNING mux#linkmgrd: link_manager/LinkManagerStateMachineActiveActive.cpp:463 handleStateChange: Ethernet8: Received mux state event, new state: Unknown
  • the nic_simulator logs, it shows that if self heartbeats missing comes first, and only one toggle(self) will be requested:
2022-10-12 10:46:08,754 set_drop             INFO  #0512| Set drop on bridge baa-svc1-3-3: portids=[1], directions=[1], recover=False4
2022-10-12 10:46:08,928 query_forwarding_sta INFO  #0507| Query bridge baa-svc1-3-3 forwarding state for ports [0, 1]: ('ACTIVE', 'STANDBY')
2022-10-12 10:46:11,610 set_forwarding_state INFO  #0498| Set bridge baa-svc1-3-3 port 0 forwarding state: STANDBY
2022-10-12 10:46:11,649 query_forwarding_sta INFO  #0507| Query bridge baa-svc1-3-3 forwarding state for ports [0]: ('STANDBY',)                        // toggling peer, the lower ToR
2022-10-12 10:46:16,441 set_forwarding_state INFO  #0498| Set bridge baa-svc1-3-3 port 1 forwarding state: STANDBY
2022-10-12 10:46:16,506 query_forwarding_sta INFO  #0507| Query bridge baa-svc1-3-3 forwarding state for ports [1]: ('STANDBY',)                        // toggling itself, the upper ToR
logs showing only toggling self
  • the nic_simulator logs that shows the toggling self is processed:
2022-10-12 10:46:10,110 set_drop             INFO  #0512| Set drop on bridge baa-svc1-3-5: portids=[1], directions=[1], recover=False
2022-10-12 10:46:10,329 query_forwarding_sta INFO  #0507| Query bridge baa-svc1-3-5 forwarding state for ports [0, 1]: ('ACTIVE', 'STANDBY')
2022-10-12 10:46:17,900 set_forwarding_state INFO  #0498| Set bridge baa-svc1-3-5 port 1 forwarding state: STANDBY                                      // toggling itself, the upper ToR
2022-10-12 10:46:17,961 query_forwarding_sta INFO  #0507| Query bridge baa-svc1-3-5 forwarding state for ports [1]: ('STANDBY',)

[linkmgrd] No debug symbols included in the `docker-mux-dbg` `linkmgrd` binary

As the subject.

  • 4a470f7851a3 is docker-mux-dbg container id.
$ docker exec -it 4a470f7851a3 bash
root@4a470f7851a3:/# md5sum /usr/sbin/linkmgrd
82456cb93e52686682eae60844a3ea9d  /usr/sbin/linkmgrd
root@4a470f7851a3:/#
exit
$ docker exec -it mux bash
root@svcstr-7050-acs-1:/# md5sum /usr/sbin/linkmgrd
82456cb93e52686682eae60844a3ea9d  /usr/sbin/linkmgrd

[active-active] icmp self event is reported twice in a sending interval

log:

[2022-10-21 02:31:42.709347] [debug] link_prober/LinkProber.cpp:174 startProbing: Ethernet12
[2022-10-21 02:31:42.709566] [debug] link_prober/LinkProber.cpp:585 startTimer: Ethernet12
[2022-10-21 02:31:42.709610] [debug] link_prober/LinkProber.cpp:930 getProbingInterval: Ethernet12
[2022-10-21 02:31:42.709665] [debug] link_prober/LinkProber.cpp:377 handleRecv: Ethernet12
[2022-10-21 02:31:42.711263] [debug] link_prober/LinkProber.cpp:377 handleRecv: Ethernet12
[2022-10-21 02:31:42.711332] [debug] link_prober/ActiveState.cpp:77 handleEvent: Ethernet12
[2022-10-21 02:31:42.711374] [debug] link_prober/ActiveState.cpp:117 resetState: Ethernet12
[2022-10-21 02:31:43.374245] [debug] link_prober/LinkProber.cpp:377 handleRecv: Ethernet12
[2022-10-21 02:31:43.374361] [debug] link_prober/ActiveState.cpp:77 handleEvent: Ethernet12
[2022-10-21 02:31:43.374525] [debug] link_prober/ActiveState.cpp:117 resetState: Ethernet12
[2022-10-21 02:31:43.374611] [debug] link_prober/PeerActiveState.cpp:46 handleEvent: Ethernet12
[2022-10-21 02:31:43.374670] [debug] link_prober/PeerActiveState.cpp:83 resetState: Ethernet12
