freebsd-src's Introduction

Archived repository note

Everything has its end. This project does as well.
MCUSim will be archived from now on as I don't have enough time to develop it further.
Feel free to fork it. If you are interested in plans and development ideas, drop me a note via email. I'll gladly share them. Good luck!


Readme for MCUSim

MCUSim is a digital simulator and an NGSpice library of microcontrollers. It was created to assist in circuit simulation, firmware debugging, testing and signal tracing.

Please note that only 8-bit AVR (RISC) microcontrollers are targeted at the moment. Feel free to start a discussion about any other microcontroller family or architecture.

Quick start

$ git clone https://github.com/mcusim/MCUSim.git
$ cd MCUSim
$ mkdir build && cd build
$ cmake -DWITH_UNIT_TESTS=True -DWITH_XSPICE=True ..
$ make && make check && make tests
$ make install

Description

There is an mcusim.conf configuration file installed together with the mcusim binary and libmsim which can be used to tweak the program.

The best way to prepare your own simulation is to copy mcusim.conf to a new directory, adjust the options and run the simulator. Firmware and Lua files should also be placed there.

Lua scripts can be used to substitute models of real devices during a simulation. They may affect the state of the chip in several ways, e.g. access registers, generate signals or terminate the MCU.

Scripts are meant to model external devices connected to the MCU of the simulated circuit (external EEPROM, humidity sensor, MOSFET switch, etc).

Registers of the simulated MCU can be saved into a VCD (value change dump) file and read using GTKWave viewer.

How can I start a discussion?

Feel free to ask questions and start a discussion on the developers' mailing list. Just subscribe and send a message.

How can I join the development?

You may drop a note on the mailing list first, or just code the feature you want to add and share your patch there. Before you start coding, check the latest development release of MCUSim in our git repository or look for a ticket at https://trac.mcusim.org/report. Your feature may already have been implemented.

There is no bureaucracy here.

Web sites

Source code is hosted at https://github.com/mcusim/MCUSim.
Wiki and issue tracker are at https://trac.mcusim.org.
Mailing list is at https://www.freelists.org/list/mcusim-dev.

freebsd-src's People

Contributors

alcriceedu, amotin, avg-i, bapt, bdrewery, behlendorf, brooksdavis, bsdimp, bsdjhb, bsdphk, dag-erling, darkhelmet433, delphij, dimitryandric, emaste, glebius, gwollman, hselasky, juikim, kevans91, kostikbel, markjdb, mjguzik, ngie-eign, rwatson, sleffler, sparcplug, trasz, tuexen, zxombie

Forkers

bsder mcbridematt

freebsd-src's Issues

Multicast

While trying to use IPv6, it seemed that MC was not working reliably.

"Failed to pull frames" when using multiple DPNIs

Hardware: Ten64
MC firmware: 10.20
Commit: 6efa7d1

When more than one interface / DPNI is transferring data, the following errors appear in the system console / dmesg:

dpaa2_ni0: failed to pull frames: chan_id=15, error=16
dpaa2_ni1: failed to pull frames: chan_id=23, error=16
dpaa2_ni1: failed to pull frames: chan_id=23, error=16
dpaa2_ni1: failed to pull frames: chan_id=23, error=16
dpaa2_ni0: failed to pull frames: chan_id=15, error=16
dpaa2_ni1: failed to pull frames: chan_id=23, error=16
dpaa2_ni0: failed to pull frames: chan_id=15, error=16
dpaa2_ni1: failed to pull frames: chan_id=23, error=16
dpaa2_ni0: failed to pull frames: chan_id=15, error=16
dpaa2_ni1: failed to pull frames: chan_id=23, error=16
dpaa2_ni1: failed to pull frames: chan_id=23, error=16
dpaa2_ni0: failed to pull frames: chan_id=15, error=16
dpaa2_ni1: failed to pull frames: chan_id=23, error=16
dpaa2_ni1: failed to pull frames: chan_id=23, error=16

An example use case is when the system is used as a router between two network interfaces.
I don't see any evidence of packet loss, which is good.

This message is printed by dpaa2_ni_poll_task around line 2367:

	error = dpaa2_swp_pull(swp, chan->id, chan->store.paddr,
	    ETH_STORE_FRAMES);
	if (error) {
		device_printf(chan->ni_dev, "failed to pull frames: "
		    "chan_id=%d, error=%d\n", chan->id, error);
		break;
	}
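To read the numeric values in these log lines, a small lookup helps. The sketch below assumes the driver reports plain errno values at this point, which the report does not confirm; `errno_name` is a hypothetical helper, not part of the driver. Under that assumption, error=16 is EBUSY, and the values 6 and 27 that appear in other reports on this page would be ENXIO and EFBIG.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Map a numeric error from a dpaa2 log line to an errno name.
 * Assumption: the logged value is a plain errno, not an MC status code.
 * Returns NULL for values this sketch does not cover.
 */
static const char *
errno_name(int error)
{
    assert(error >= 0);

    switch (error) {
    case ENXIO:     /* 6: device not configured */
        return ("ENXIO");
    case EBUSY:     /* 16: device or resource busy */
        return ("EBUSY");
    case EFBIG:     /* 27: object too large */
        return ("EFBIG");
    default:
        return (NULL);
    }
}
```

Calling `errno_name(16)` returns the string "EBUSY". The numeric values are taken from `<errno.h>` rather than hard-coded, so the mapping follows the platform the sketch is built on.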

VLAN_MTU

dpni0 announces VLAN_MTU, but creating a vlan interface on top of it causes MTU issues. Running ifconfig vlan0 mtu 1480 as a check makes things work, so something is not correct.

Use DPMCPs for communicating with MC / fix VFIO guest

Under VFIO passthrough for DPAA2, most MC commands have to be run through DPMCPs instead of the DPRC root container.

This requirement is enforced by QEMU which has a security filter:
https://source.codeaurora.org/external/qoriq/qoriq-components/qemu/tree/hw/vfio/fsl_mc.c?h=integration&id=14fda5a42df6c72e890d6a97ff88c5852172604b#n688

If you attempt to do DPBP, DPIO or most other object instructions through the DPRC, you get a 0x3 (authentication error) response generated by QEMU:

dpaa2_mc0: mem 0x4040000000-0x404000ffff on ofwbus0
dpaa2_rc0: on dpaa2_mc0
dpaa2_rc0: MC firmware version: 10.20.4
dpaa2_bp0: at dpbp (id=0) on dpaa2_rc0
dpaa2_bp0: Failed to reset DPBP: id=0, error=3
device_attach: dpaa2_bp0 attach returned 6
dpaa2_io0: <DPAA2 I/O> iomem 0x4048000000-0x404800ffff,0x4044000000-0x404400ffff at dpio (id=0) on dpaa2_rc0
dpaa2_io0: Failed to reset DPIO: id=0, error=3
device_attach: dpaa2_io0 attach returned 6
dpaa2_con0: at dpcon (id=2) on dpaa2_rc0
dpaa2_con0: Failed to reset DPCON: id=2, error=3
device_attach: dpaa2_con0 attach returned 6
dpaa2_con1: at dpcon (id=0) on dpaa2_rc0
dpaa2_con1: Failed to reset DPCON: id=0, error=3

dpaa2_ni_tx_task: can't load TX buffer: error=27

Hello,

On a Traverse Ten64, I installed FreeBSD in different ways:

  • building from your sources
  • building from your sources + rebasing main
  • using the baremetal provided by Traverse here.

The network is set as:
/etc/rc.conf.d/network

ifconfig_mlxen0="DHCP"
ifconfig_mlxen0_ipv6="inet6 accept_rtadv up"

Nothing else fancy, no other network interface is configured/used.

I rsynced a 40GB file multiple times without any problem.
But each time I run a command with verbose output, such as
ssh root@traverse_machine dmesg

I instantly lose the connection and see "dpaa2_ni_tx_task: can't load TX buffer: error=27" repeated in the serial console.
After that, I can do nothing but reboot the appliance.

MAC filter failures

dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3

Just got this during bootup. I have not seen it before.

Build failure without INVARIANTS (includes suggested patch)

When you try to build a kernel without INVARIANTS, such as a NODEBUG kernel, compilation fails with:

--- all_subdir_dpaa2 ---
/usr/src/sys/dev/dpaa2/dpaa2_swp.c:1034:5: error: variable 'r' set but not used [-Werror,-Wunused-but-set-variable]
        } *r;
           ^
1 error generated.
*** [dpaa2_swp.o] Error code 1

make[4]: stopped in /usr/src/sys/modules/dpaa2
1 error

make[4]: stopped in /usr/src/sys/modules/dpaa2

A possible fix is to wrap both the definition and the assignment of variable r in #if ... #endif.

In this case I took the #if condition from sys/sys/kassert.h, where KASSERT is also defined:


diff --git a/sys/dev/dpaa2/dpaa2_swp.c b/sys/dev/dpaa2/dpaa2_swp.c
index 200beb8dce57..c2b231826d38 100644
--- a/sys/dev/dpaa2/dpaa2_swp.c
+++ b/sys/dev/dpaa2/dpaa2_swp.c
@@ -1028,10 +1028,12 @@ static int
 dpaa2_swp_exec_mgmt_command(struct dpaa2_swp *swp, struct dpaa2_swp_cmd *cmd,
     struct dpaa2_swp_rsp *rsp, uint8_t cmdid)
 {
+#if (defined(_KERNEL) && defined(INVARIANTS)) || defined(_STANDALONE)
        struct __packed with_verb {
                uint8_t verb;
                uint8_t _reserved[63];
        } *r;
+#endif
        uint16_t flags;
        int error;
 
@@ -1057,7 +1059,9 @@ dpaa2_swp_exec_mgmt_command(struct dpaa2_swp *swp, struct dpaa2_swp_cmd *cmd,
        }
        dpaa2_swp_unlock(swp);
 
+#if (defined(_KERNEL) && defined(INVARIANTS)) || defined(_STANDALONE)
        r = (struct with_verb *) rsp;
+#endif
        KASSERT((r->verb & CMD_VERB_MASK) == cmdid,
            ("wrong VERB byte in response: resp=0x%02x, expected=0x%02x",
            r->verb, cmdid));
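As an alternative to the #if guards in the patch, the variable can be annotated as possibly unused, so that it needs no guard when the assertion compiles away. The following is a user-space sketch of that idea only: `SKETCH_INVARIANTS`, `SKETCH_KASSERT`, `SKETCH_DIAGUSED`, `SKETCH_VERB_MASK` and `check_response` are hypothetical stand-ins, and the mask value is made up. FreeBSD is believed to provide a similar `__diagused` annotation, but that should be verified against the tree being built.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins: SKETCH_INVARIANTS plays the role of INVARIANTS
 * and SKETCH_KASSERT the role of KASSERT.
 */
#ifdef SKETCH_INVARIANTS
#define SKETCH_KASSERT(exp) assert(exp)
#define SKETCH_DIAGUSED
#else
#define SKETCH_KASSERT(exp) do { } while (0)
#define SKETCH_DIAGUSED     __attribute__((__unused__))
#endif

#define SKETCH_VERB_MASK    0x7f    /* made-up value for illustration */

struct with_verb {
    uint8_t verb;
    uint8_t _reserved[63];
};

/* Check the VERB byte of a response; the check vanishes without assertions. */
static int
check_response(const void *rsp, uint8_t cmdid)
{
    /* Only read inside the assertion, hence the annotation. */
    const struct with_verb *r SKETCH_DIAGUSED;

    (void)cmdid;    /* otherwise unused when assertions are compiled out */
    r = rsp;
    SKETCH_KASSERT((r->verb & SKETCH_VERB_MASK) == cmdid);
    return (0);
}
```

With this shape the declaration and the assignment stay unconditional, and only the macro definitions depend on the build configuration.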

With the patch, the NODEBUG kernel builds. After a two-way transfer of ~1G in each direction (at ~960 Mbps), the counters look like this:

# sysctl dev.dpaa2_ni.0 && netstat -i -b -n -I dpni0
dev.dpaa2_ni.0.stats.in_all_frames: 1087188
dev.dpaa2_ni.0.stats.in_all_bytes: 1120373738
dev.dpaa2_ni.0.stats.in_multi_frames: 70
dev.dpaa2_ni.0.stats.eg_all_frames: 1086856
dev.dpaa2_ni.0.stats.eg_all_bytes: 1120347810
dev.dpaa2_ni.0.stats.eg_multi_frames: 0
dev.dpaa2_ni.0.stats.in_filtered_frames: 0
dev.dpaa2_ni.0.stats.in_discarded_frames: 0
dev.dpaa2_ni.0.stats.in_nobuf_discards: 0
dev.dpaa2_ni.0.stats.rx_ieoi_err_frames: 0
dev.dpaa2_ni.0.stats.rx_enq_rej_frames: 0
dev.dpaa2_ni.0.stats.rx_sg_buf_frames: 0
dev.dpaa2_ni.0.stats.rx_single_buf_frames: 1087192
dev.dpaa2_ni.0.stats.rx_anomaly_frames: 0
dev.dpaa2_ni.0.channels.15.tx_dropped: 0
dev.dpaa2_ni.0.channels.15.tx_frames: 0
dev.dpaa2_ni.0.channels.14.tx_dropped: 0
dev.dpaa2_ni.0.channels.14.tx_frames: 0
dev.dpaa2_ni.0.channels.13.tx_dropped: 0
dev.dpaa2_ni.0.channels.13.tx_frames: 0
dev.dpaa2_ni.0.channels.12.tx_dropped: 0
dev.dpaa2_ni.0.channels.12.tx_frames: 0
dev.dpaa2_ni.0.channels.11.tx_dropped: 0
dev.dpaa2_ni.0.channels.11.tx_frames: 0
dev.dpaa2_ni.0.channels.10.tx_dropped: 0
dev.dpaa2_ni.0.channels.10.tx_frames: 0
dev.dpaa2_ni.0.channels.9.tx_dropped: 0
dev.dpaa2_ni.0.channels.9.tx_frames: 0
dev.dpaa2_ni.0.channels.8.tx_dropped: 0
dev.dpaa2_ni.0.channels.8.tx_frames: 0
dev.dpaa2_ni.0.channels.7.tx_dropped: 0
dev.dpaa2_ni.0.channels.7.tx_frames: 0
dev.dpaa2_ni.0.channels.6.tx_dropped: 0
dev.dpaa2_ni.0.channels.6.tx_frames: 0
dev.dpaa2_ni.0.channels.5.tx_dropped: 0
dev.dpaa2_ni.0.channels.5.tx_frames: 0
dev.dpaa2_ni.0.channels.4.tx_dropped: 0
dev.dpaa2_ni.0.channels.4.tx_frames: 0
dev.dpaa2_ni.0.channels.3.tx_dropped: 0
dev.dpaa2_ni.0.channels.3.tx_frames: 0
dev.dpaa2_ni.0.channels.2.tx_dropped: 0
dev.dpaa2_ni.0.channels.2.tx_frames: 0
dev.dpaa2_ni.0.channels.1.tx_dropped: 0
dev.dpaa2_ni.0.channels.1.tx_frames: 0
dev.dpaa2_ni.0.channels.0.tx_dropped: 0
dev.dpaa2_ni.0.channels.0.tx_frames: 1086897
dev.dpaa2_ni.0.%parent: dpaa2_rc0
dev.dpaa2_ni.0.%pnpinfo: 
dev.dpaa2_ni.0.%location: 
dev.dpaa2_ni.0.%driver: dpaa2_ni
dev.dpaa2_ni.0.%desc: DPAA2 Network Interface
Name    Mtu Network       Address              Ipkts Ierrs Idrop     Ibytes    Opkts Oerrs     Obytes  Coll
dpni0  1500 <Link#1>      82:e3:3f:86:00:11        0     0     0 1120375388        0     0          0     0
dpni0     - 192.168.1.0/2 192.168.1.52       1087085     -     - 1105137104  1086901     - 1105137884     -
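In the output above, every transmitted frame is counted on channel 0 while channels 1 to 15 report zero. A quick way to check this on other runs is to scan the sysctl lines for the per-channel tx_frames counters. The sketch below assumes the line format shown above; `parse_tx_frames` and `count_active_channels` are hypothetical helper names.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse one sysctl line. Returns 1 and fills *chan and *frames if the line
 * is a per-channel tx_frames counter of a dpaa2_ni device, 0 otherwise.
 */
static int
parse_tx_frames(const char *line, int *chan, unsigned long long *frames)
{
    unsigned long long value;
    int unit, ch;

    assert(line != NULL && chan != NULL && frames != NULL);

    if (sscanf(line, "dev.dpaa2_ni.%d.channels.%d.tx_frames: %llu",
        &unit, &ch, &value) != 3)
        return (0);
    *chan = ch;
    *frames = value;
    return (1);
}

/*
 * Count the channels with a non-zero tx_frames counter and add up all
 * tx_frames values into *total.
 */
static int
count_active_channels(const char *const *lines, size_t nlines,
    unsigned long long *total)
{
    unsigned long long frames;
    size_t i;
    int chan, active = 0;

    assert(lines != NULL && total != NULL);

    *total = 0;
    for (i = 0; i < nlines; i++) {
        if (!parse_tx_frames(lines[i], &chan, &frames))
            continue;
        *total += frames;
        if (frames != 0)
            active++;
    }
    return (active);
}
```

Fed the channel lines from this report, `count_active_channels` would report a single active channel.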

no traffic flows on latest ten64 branch

src 5f6b8b3 tip of ten64 branch.

Not quite sure how to characterise this bug. I see inbound IP traffic but nothing
makes it out an interface, nor across between interfaces. pf is running, route
appears to select the correct interface. Just nothing gets out.

# ifconfig
dpni1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8002b<RXCSUM,TXCSUM,VLAN_MTU,JUMBO_MTU,LINKSTATE>
        ether 00:0a:fa:24:2b:16
        inet 172.16.1.1 netmask 0xffffff00 broadcast 172.16.1.255
        inet6 fe80::20a:faff:fe24:2b16%dpni1 prefixlen 64 scopeid 0x2
        inet6 2a02:ab8:201:14a0::1 prefixlen 128
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
dpni2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8002b<RXCSUM,TXCSUM,VLAN_MTU,JUMBO_MTU,LINKSTATE>
        ether 00:0a:fa:24:2b:17
        inet 172.16.2.1 netmask 0xffffff00 broadcast 172.16.2.255
        media: Ethernet autoselect (1000baseT <full-duplex,master>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
...
  • still see arp
root@continuity:~ # arp -a
? (172.16.3.1) at 00:0a:fa:24:2b:18 on dpni3 permanent [ethernet]
? (172.16.2.28) at 7c:9e:bd:e0:1f:4c on dpni2 expires in 1088 seconds [ethernet]
? (172.16.2.4) at f0:9f:c2:17:e4:3c on dpni2 expires in 1080 seconds [ethernet]
? (172.16.2.37) at (incomplete) on dpni2 expired [ethernet]
? (172.16.2.5) at 00:03:ac:41:53:52 on dpni2 expires in 1101 seconds [ethernet]
? (172.16.2.2) at 80:2a:a8:83:e2:a3 on dpni2 expires in 1114 seconds [ethernet]
? (172.16.2.3) at 80:2a:a8:59:bd:3f on dpni2 expires in 1110 seconds [ethernet]
? (172.16.2.1) at 00:0a:fa:24:2b:17 on dpni2 permanent [ethernet]
? (172.16.1.5) at b8:59:9f:1a:82:26 on dpni1 expires in 1082 seconds [ethernet]
? (172.16.1.4) at ac:1f:6b:67:e1:38 on dpni1 expires in 1079 seconds [ethernet]
? (172.16.1.1) at 00:0a:fa:24:2b:16 on dpni1 permanent [ethernet]
  • and tcpdump sees udp & bootp requests as well, coming in:
00:04:39.287011 IP (tos 0x0, ttl 64, id 13090, offset 0, flags [DF], proto UDP (17), length 273)
    172.16.2.2.42964 > 255.255.255.255.10001: [udp sum ok] UDP, length 245
00:04:39.288209 IP6 (flowlabel 0x63218, hlim 1, next-header UDP (17) payload length: 253) fe80::822a:a8ff:fe83:e2a3.44188 > ff02::1.10001: [udp sum ok] UDP, length 245
00:04:41.650980 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 252)
    172.16.2.4.50424 > 255.255.255.255.10001: [udp sum ok] UDP, length 224
00:04:41.651607 IP6 (hlim 1, next-header UDP (17) payload length: 232) fe80::f29f:c2ff:fe17:e43c.44241 > ff02::1.10001: [udp sum ok] UDP, length 224
00:04:44.214711 IP (tos 0x0, ttl 64, id 32404, offset 0, flags [DF], proto UDP (17), length 274)
    192.168.1.20.55774 > 255.255.255.255.10001: [udp sum ok] UDP, length 246
00:04:44.215969 IP6 (flowlabel 0xbcec0, hlim 1, next-header UDP (17) payload length: 254) fe80::822a:a8ff:fe59:bd3f.46677 > ff02::1.10001: [udp sum ok] UDP, length 246
00:04:51.671240 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 252)
    172.16.2.4.53764 > 255.255.255.255.10001: [udp sum ok] UDP, length 224
00:04:51.673338 IP (tos 0x0, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 328)
    0.0.0.0.bootpc > 255.255.255.255.bootps: [udp sum ok] BOOTP/DHCP, Request from 80:2a:a8:59:bd:3f (oui Unknown), length 300, xid 0x6b4285a, secs 122, Flags [none] (0x0000)
          Client-Ethernet-Address 80:2a:a8:59:bd:3f (oui Unknown)
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Discover
            Client-ID Option 61, length 7: ether 80:2a:a8:59:bd:3f
            Requested-IP Option 50, length 4: 172.16.2.3
            MSZ Option 57, length 2: 576
            Parameter-Request Option 55, length 8:
              Subnet-Mask, Default-Gateway, Domain-Name-Server, Hostname
              Domain-Name, BR, NTP, Vendor-Option
            Vendor-Class Option 60, length 4: "ubnt"
            Hostname Option 12, length 7: "terrace"
            END Option 255, length 0
            PAD Option 0, length 0, occurs 12

but nothing goes out:

root@continuity:~ # vmstat -i | grep dpaa
its0,33: dpaa2_mac0                                       1          0
its0,34: dpaa2_mac1                                       2          0
its0,35: dpaa2_mac2                                       2          0
its0,37: dpaa2_mac4                                       2          0
its0,43: dpaa2_io0                                       17          0
its0,44: dpaa2_io1                                      449          1
its0,45: dpaa2_io2                                     1122          4
its0,46: dpaa2_io3                                     1111          4
its0,47: dpaa2_io4                                     1061          4
its0,48: dpaa2_io5                                       90          0
its0,49: dpaa2_io6                                     1013          3
its0,50: dpaa2_io7                                      359          1
its0,52: dpaa2_ni1                                        1          0
its0,53: dpaa2_ni2                                        1          0
its0,58: dpaa2_ni7                                        1          0

root@continuity:~ # ping -fc 1000 172.16.1.4

root@continuity:~ # vmstat -i | grep dpaa
its0,33: dpaa2_mac0                                       1          0
its0,34: dpaa2_mac1                                       2          0
its0,35: dpaa2_mac2                                       2          0
its0,37: dpaa2_mac4                                       2          0
its0,43: dpaa2_io0                                       17          0
its0,44: dpaa2_io1                                      449          1
its0,45: dpaa2_io2                                     1430          4
its0,46: dpaa2_io3                                     1392          4
its0,47: dpaa2_io4                                     1353          4
its0,48: dpaa2_io5                                       90          0
its0,49: dpaa2_io6                                     1320          3
its0,50: dpaa2_io7                                      359          1
its0,52: dpaa2_ni1                                        1          0
its0,53: dpaa2_ni2                                        1          0
its0,58: dpaa2_ni7                                        1          0
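Comparing the two vmstat -i snapshots by eye is error-prone, so the difference can be computed per device. The sketch below assumes the "its0,N: name total rate" layout shown above; `intr_total` is a hypothetical helper. For these snapshots, dpaa2_io2 rises by 308, dpaa2_io3 by 281, dpaa2_io4 by 292 and dpaa2_io6 by 307, while the other lines are unchanged.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the interrupt total for device `dev` from one vmstat -i line.
 * Returns 1 and stores the total if the line belongs to `dev`, 0 otherwise.
 */
static int
intr_total(const char *line, const char *dev, unsigned long long *total)
{
    unsigned long long value;
    char name[32];
    int irq;

    assert(line != NULL && dev != NULL && total != NULL);

    if (sscanf(line, "its0,%d: %31s %llu", &irq, name, &value) != 3)
        return (0);
    if (strcmp(name, dev) != 0)
        return (0);
    *total = value;
    return (1);
}
```

Calling it once on the "before" line and once on the "after" line for the same device, then subtracting, gives the number of interrupts taken during the ping run.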

root@continuity:~ # sysctl dev.dpaa2_ni.1
dev.dpaa2_ni.1.stats.in_all_frames: 15310
dev.dpaa2_ni.1.stats.in_all_bytes: 1346077
dev.dpaa2_ni.1.stats.in_multi_frames: 619
dev.dpaa2_ni.1.stats.eg_all_frames: 160
dev.dpaa2_ni.1.stats.eg_all_bytes: 24574
dev.dpaa2_ni.1.stats.eg_multi_frames: 6
dev.dpaa2_ni.1.stats.in_filtered_frames: 3
dev.dpaa2_ni.1.stats.in_discarded_frames: 0
dev.dpaa2_ni.1.stats.in_nobuf_discards: 0
dev.dpaa2_ni.1.stats.buf_free: 1392
dev.dpaa2_ni.1.stats.buf_num: 11200
dev.dpaa2_ni.1.%parent: dpaa2_rc0
dev.dpaa2_ni.1.%pnpinfo:
dev.dpaa2_ni.1.%location:
dev.dpaa2_ni.1.%driver: dpaa2_ni
dev.dpaa2_ni.1.%desc: DPAA2 Network Interface

Working in ACPI mode

I noticed some strange Ethernet behavior on HoneyCombLX2K in ACPI mode, using the dpaa2 Ethernet driver.

Here is what happens.
I'm building my package repository on HoneyComb.
When trying to bulk install packages from this repository,
a message about an incorrect checksum appears while the packages are downloaded.

If we use USB, everything works without errors.

Is it possible to fix this somehow?
I'm using FreeBSD-14.0-RELEASE.

ten64: Slow or stall enumerating DPNIs on boot

I tried to run the latest code but am stuck at this problem.

At boot there is a very long pause (or stall) while enumerating the DPNIs:

[2022-06-25 22:04:26.817] dpaa2_rc0: dpaa2_rc_discover: failed to get object: idx=30, error=253
[2022-06-25 22:04:26.832] dpaa2_ni0: <DPAA2 Network Interface> dpio (id=5-7,0-4) dpbp (id=0) dpcon (id=21-28) dpmcp (id=4) at dpni (id=9) on dpaa2_rc0
[2022-06-25 22:04:26.870] dpaa2_ni0: connected to dpmac (id=7)
[2022-06-25 22:04:26.893] dpaa2_ni0: connected DPMAC is in FIXED mode
[2022-06-25 22:04:26.893] dpaa2_ni0: channels=8
[2022-06-25 22:04:27.612] dpni0: Ethernet address: 00:0a:fa:24:24:fd
[2022-06-25 22:04:27.612] dpaa2_ni1: <DPAA2 Network Interface> dpio (id=7,6,5,4,3,2,1,0) dpbp (id=1) dpcon (id=29-30,6-11) dpmcp (id=3) at dpni (id=8) on dpaa2_rc0
[2022-06-25 22:04:27.626] dpaa2_ni1: connected to dpmac (id=8)
[2022-06-25 22:04:27.649] dpaa2_ni1: connected DPMAC is in FIXED mode
[2022-06-25 22:04:27.649] dpaa2_ni1: channels=8
[2022-06-25 22:04:28.349] dpni1: Ethernet address: 00:0a:fa:24:24:fe
[2022-06-25 22:04:28.349] dpaa2_ni2: <DPAA2 Network Interface> dpio (id=7,6,5,4,3,2,1,0) dpbp (id=2) dpcon (id=12-19) dpmcp (id=2) at dpni (id=7) on dpaa2_rc0
[2022-06-25 22:04:28.363] dpaa2_ni2: connected to dpmac (id=9)
[2022-06-25 22:04:28.386] dpaa2_ni2: connected DPMAC is in FIXED mode
[2022-06-25 22:04:28.386] dpaa2_ni2: channels=8
[2022-06-25 22:04:29.058] dpni2: Ethernet address: 00:0a:fa:24:24:ff
[2022-06-25 22:04:29.086] dpaa2_ni3: <DPAA2 Network Interface> dpio (id=7,6,5,4,3,2,1,0) dpbp (id=3) dpcon (id=20,0-5,70) dpmcp (id=1) at dpni (id=6) on dpaa2_rc0
[2022-06-25 22:04:29.101] dpaa2_ni3: connected to dpmac (id=10)
[2022-06-25 22:04:29.123] dpaa2_ni3: connected DPMAC is in FIXED mode
[2022-06-25 22:04:29.123] dpaa2_ni3: channels=8
[2022-06-25 22:04:29.795] dpni3: Ethernet address: 00:0a:fa:24:25:00
[2022-06-25 22:04:29.795] dpaa2_ni4: <DPAA2 Network Interface> dpio (id=7,6,5,4,3,2,1,0) dpbp (id=4) dpcon (id=71-78) dpmcp (id=28) at dpni (id=5) on dpaa2_rc0
[2022-06-25 22:04:29.837] dpaa2_ni4: connected to dpmac (id=3)
[2022-06-25 22:04:29.837] dpaa2_ni4: connected DPMAC is in FIXED mode
[2022-06-25 22:04:29.859] dpaa2_ni4: channels=8
[2022-06-25 22:04:30.638] dpni4: Ethernet address: 00:0a:fa:24:25:01
[2022-06-25 22:04:30.638] dpaa2_ni5: <DPAA2 Network Interface> dpio (id=7,6,5,4,3,2,1,0) dpbp (id=5) dpcon (id=79,55-60,62) dpmcp (id=27) at dpni (id=4) on dpaa2_rc0
[2022-06-25 22:04:30.676] dpaa2_ni5: connected to dpmac (id=4)
[2022-06-25 22:04:30.676] dpaa2_ni5: connected DPMAC is in FIXED mode
[2022-06-25 22:04:30.676] dpaa2_ni5: channels=8
[2022-06-25 22:04:32.001] dpni5: Ethernet address: 00:0a:fa:24:25:02
[2022-06-25 22:04:32.001] dpaa2_ni6: <DPAA2 Network Interface> dpio (id=7,6,5,4,3,2,1,0) dpbp (id=6) dpcon (id=61,63-69) dpmcp (id=26) at dpni (id=3) on dpaa2_rc0
[2022-06-25 22:04:32.017] dpaa2_ni6: connected to dpmac (id=5)
[2022-06-25 22:04:32.039] dpaa2_ni6: connected DPMAC is in FIXED mode
[2022-06-25 22:04:32.039] dpaa2_ni6: channels=8
[2022-06-25 22:04:34.267] dpni6: Ethernet address: 00:0a:fa:24:25:03
[2022-06-25 22:04:34.267] dpaa2_ni7: <DPAA2 Network Interface> dpio (id=7,6,5,4,3,2,1,0) dpbp (id=7) dpcon (id=40-47) dpmcp (id=25) at dpni (id=2) on dpaa2_rc0
[2022-06-25 22:04:34.283] dpaa2_ni7: connected to dpmac (id=6)
[2022-06-25 22:04:34.305] dpaa2_ni7: connected DPMAC is in FIXED mode
[2022-06-25 22:04:34.305] dpaa2_ni7: channels=8
[2022-06-25 22:04:36.359] dpni7: Ethernet address: 00:0a:fa:24:25:04
[2022-06-25 22:04:36.359] dpaa2_ni8: <DPAA2 Network Interface> dpio (id=7,6,5,4,3,2,1,0) dpbp (id=8) dpcon (id=48-54,31) dpmcp (id=24) at dpni (id=1) on dpaa2_rc0
[2022-06-25 22:04:36.366] dpaa2_ni8: connected to dpmac (id=2)
[2022-06-25 22:04:36.391] dpaa2_ni8: connected DPMAC is in FIXED mode
[2022-06-25 22:04:36.391] dpaa2_ni8: channels=8
[2022-06-25 22:05:14.196] dpni8: Ethernet address: 00:0a:fa:24:25:06
[2022-06-25 22:05:14.196] dpaa2_ni9: <DPAA2 Network Interface> dpio (id=7,6,5,4,3,2,1,0) dpbp (id=9) dpcon (id=32-39) dpmcp (id=23) at dpni (id=0) on dpaa2_rc0
[2022-06-25 22:05:14.258] dpaa2_ni9: connected to dpmac (id=1)
[2022-06-25 22:05:14.258] dpaa2_ni9: connected DPMAC is in FIXED mode
[2022-06-25 22:05:14.258] dpaa2_ni9: channels=8
[2022-06-25 22:07:09.710] dpni9: Ethernet address: 00:0a:fa:24:25:05
[2022-06-25 22:07:09.786] gpioled0: <GPIO LEDs> on ofwbus0

Note the timestamps: the last couple of DPNIs take longer to enumerate.
Between dpaa2_ni8: channels=8 and dpni8: Ethernet address: 00:0a:fa:24:25:06 there is a delay of about 40 seconds.
After dpaa2_ni9: channels=8 there is a delay of about two minutes.
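The delays can be read off the log mechanically. The sketch below assumes the "[YYYY-MM-DD HH:MM:SS.mmm]" prefix used in the capture above and that both lines fall on the same day; `log_ms_of_day` is a hypothetical helper. For the two gaps mentioned here it yields 37805 ms and 115452 ms.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse the "[YYYY-MM-DD HH:MM:SS.mmm]" prefix of a log line and return
 * the time of day in milliseconds, or -1 if the prefix does not match.
 * The date is parsed but ignored, so both lines must be from the same day.
 */
static long
log_ms_of_day(const char *line)
{
    int year, month, day, hour, min, sec, msec;

    assert(line != NULL);

    if (sscanf(line, "[%d-%d-%d %d:%d:%d.%d]",
        &year, &month, &day, &hour, &min, &sec, &msec) != 7)
        return (-1);
    return (((hour * 60L + min) * 60L + sec) * 1000L + msec);
}
```

Subtracting the value for the "channels=8" line from the value for the following "Ethernet address" line gives the stall for that DPNI.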

On one system it gets stuck at dpni7 (and stays there); on the other it stalls for a minute at dpni9. (Coincidentally, dpni8 and dpni9 are the SFP ports.)
I tried both MC firmware 10.20 (the current Ten64 default) and 10.29.1 without any change in behaviour.

I bisected the issue to commits 19d8245 and 846462f

Commit e856e7a is the last good commit

dpaa2_mcp27 errors

dmesg -a | grep dpaa2_mcp27

dpaa2_mcp27: <DPAA2 MC portal> iomem 0x80c010000-0x80c01003f at dpmcp (id=1) on dpaa2_rc0
dpaa2_mcp27: dpaa2_mcp_attach: failed to reset DPMCP: id=1, error=6
device_attach: dpaa2_mcp27 attach returned 6
dpaa2_mcp27: <DPAA2 MC portal> iomem 0x80c010000-0x80c01003f at dpmcp (id=1) on dpaa2_rc0
dpaa2_rc0: resource entry 0 type 3 for child dpaa2_mcp27 is busy
dpaa2_mcp27: dpaa2_mcp_attach: failed to allocate resources
device_attach: dpaa2_mcp27 attach returned 6
dpaa2_mcp27: <DPAA2 MC portal> iomem 0x80c010000-0x80c01003f at dpmcp (id=1) on dpaa2_rc0
dpaa2_rc0: resource entry 0 type 3 for child dpaa2_mcp27 is busy
dpaa2_mcp27: dpaa2_mcp_attach: failed to allocate resources
device_attach: dpaa2_mcp27 attach returned 6

I have seen this on the LS1088/FDT but never before on the LX2160/ACPI (not sure if FDT/ACPI is relevant).
Opening this to keep an eye on it, debug it and not forget.

panic under heavy network load

This only reproduces when more cross-dpaa interface traffic than usual is present.
I can trigger it reliably using iperf3. This is with normal CURRENT, not the fork.

$ iperf3 --parallel 16 --client 172.16.2.24  --get-server-output --time 120
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.01   sec  2.37 MBytes  19.7 Mbits/sec  149   42.1 KBytes
[  7]   0.00-1.01   sec  2.39 MBytes  19.9 Mbits/sec  133   5.28 KBytes
[  9]   0.00-1.01   sec  3.26 MBytes  27.1 Mbits/sec  210   44.8 KBytes
[ 11]   0.00-1.01   sec  2.21 MBytes  18.4 Mbits/sec   53   2.63 KBytes
[ 13]   0.00-1.01   sec  2.28 MBytes  19.0 Mbits/sec  114   1.32 KBytes
[ 15]   0.00-1.01   sec  2.33 MBytes  19.3 Mbits/sec   75   2.66 KBytes
[ 17]   0.00-1.01   sec  2.53 MBytes  21.0 Mbits/sec  201   86.9 KBytes
[ 19]   0.00-1.01   sec  2.58 MBytes  21.5 Mbits/sec  119   90.8 KBytes
[ 21]   0.00-1.01   sec  2.55 MBytes  21.2 Mbits/sec  102    134 KBytes
[ 23]   0.00-1.01   sec  2.61 MBytes  21.7 Mbits/sec   31   1.32 KBytes
[ 25]   0.00-1.01   sec  2.27 MBytes  18.9 Mbits/sec   76   1.32 KBytes
[ 27]   0.00-1.01   sec  2.53 MBytes  21.1 Mbits/sec   88    147 KBytes
[ 29]   0.00-1.01   sec  2.48 MBytes  20.6 Mbits/sec   94   1.32 KBytes
[ 31]   0.00-1.01   sec  2.54 MBytes  21.1 Mbits/sec   23   1.32 KBytes
[ 33]   0.00-1.01   sec  2.56 MBytes  21.3 Mbits/sec  113   47.4 KBytes
[ 35]   0.00-1.01   sec  2.48 MBytes  20.7 Mbits/sec   92    161 KBytes
[SUM]   0.00-1.01   sec  40.0 MBytes   333 Mbits/sec  1673
  x0:                0
  x1: ffff00010d052200
  x2: ffff0000009e0078 (console_pausestr + 25688)
  x3:              30d
  x4:                0
  x5:                d
  x6:  a009b033eaa2d8c
  x7:    8172b24fa0a00
  x8: ffff000114a9d000
  x9:                0
 x10:                1
 x11:                3
 x12:                1
 x13:                0
 x14:            10000
 x15:                1
 x16:            10000
 x17: ffff0001737b1958 (ng_unref_node + 0)
 x18: ffff00010e4806d0
 x19: ffff0001146d9000
 x20: ffff0001146d9058
 x21: ffffa000024e1200
 x22:                0
 x23: ffff00010e480750
 x24: ffffa000031db680
 x25: ffffa000031c2c00
 x26: ffff000000cac018 (Giant + 18)
 x27: ffff000000961b27 (digits + 227b9)
 x28: ffffa000031c2c10
 x29: ffff00010e4806d0
  sp: ffff00010e4806d0
  lr: ffff00000081afc0 (dpaa2_ni_poll + 3c)
 elr: ffff00000081b028 (dpaa2_ni_poll + a4)
spsr:         40000045
 far:                1
 esr:         96000004
panic: vm_fault failed: ffff00000081b028 error 1
cpuid = 7
time = 1679392819
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x13c
panic() at panic+0x44
data_abort() at data_abort+0x32c
handle_el1h_sync() at handle_el1h_sync+0x10
--- exception, esr 0
(null)() at 0x10000
KDB: enter: panic
[ thread pid 12 tid 100119 ]
Stopped at      kdb_enter+0x44: undefined       f906427f
db>
Tracing pid 12 tid 100119 td 0xffff00010d052200
db_trace_self() at db_trace_self
db_stack_trace() at db_stack_trace+0x11c
db_command() at db_command+0x2d8
db_command_loop() at db_command_loop+0x54
db_trap() at db_trap+0xf8
kdb_trap() at kdb_trap+0x28c
handle_el1h_sync() at handle_el1h_sync+0x10
--- exception, esr 0
(null)() at 0
db>
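The esr value in the register dump can be decoded by hand or with a few shifts. The sketch below follows the architectural ESR_ELx layout for AArch64 (exception class in bits 31:26, fault status code in bits 5:0); the struct and function names are hypothetical. For esr 96000004 it gives exception class 0x25, a data abort taken without a change in exception level, and fault status 0x04, a level 0 translation fault, which together with far = 1 is consistent with a read through a near-NULL pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded fields of an AArch64 ESR_ELx value (names are hypothetical). */
struct esr_fields {
    unsigned ec;    /* bits 31:26, exception class */
    unsigned il;    /* bit 25, instruction length */
    unsigned wnr;   /* bit 6, 1 = write, 0 = read (data aborts only) */
    unsigned dfsc;  /* bits 5:0, data fault status code (data aborts only) */
};

/* Split a raw ESR value into its fields. */
static void
esr_decode(uint32_t esr, struct esr_fields *out)
{
    assert(out != NULL);

    out->ec = (esr >> 26) & 0x3f;
    out->il = (esr >> 25) & 0x1;
    out->wnr = (esr >> 6) & 0x1;
    out->dfsc = esr & 0x3f;
}

/* Name the two data abort classes; everything else is left undecoded. */
static const char *
esr_ec_name(unsigned ec)
{
    switch (ec) {
    case 0x24:
        return ("data abort from a lower exception level");
    case 0x25:
        return ("data abort without a change in exception level");
    default:
        return ("not decoded by this sketch");
    }
}
```

The wnr and dfsc fields are only meaningful when the exception class is one of the two data abort classes.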
  • output of while true; vmstat -i | grep dpaa2_io; sleep 1; end
  • and top -SjwHPz -mcpu at moment of crash (tmux over mosh)
its0,43: dpaa2_io0                                 25732021        399
its0,44: dpaa2_io1                                  4262516         66
its0,45: dpaa2_io2                                  4645361         72
its0,46: dpaa2_io3                                  4869407         76
its0,47: dpaa2_io4                                  4506983         70
its0,48: dpaa2_io5                                  4257987         66
its0,49: dpaa2_io6                                  3052330         47
its0,50: dpaa2_io7                                  2853862         44


436 threads:   9 running, 373 sleeping, 54 waiting
CPU 0:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 1:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 2:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 3:  0.0% user,  0.0% nice,  0.0% system,  100% interrupt,  0.0% idle
CPU 4:  0.0% user,  0.0% nice,  0.0% system,  100% interrupt,  0.0% idle
CPU 5:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 6:  0.0% user,  0.0% nice,  100% system,  0.0% interrupt,  0.0% idle
CPU 7:  0.0% user,  0.0% nice,  0.0% system,  100% interrupt,  0.0% idle
Mem: 122M Active, 5184M Inact, 4528M Wired, 40K Buf, 21G Free
ARC: 2678M Total, 388M MFU, 1623M MRU, 1153K Anon, 53M Header, 607M Other
     1706M Compressed, 3914M Uncompressed, 2.29:1 Ratio
Swap: 4096M Total, 4096M Free

  PID   JID USERNAME2   PRI NICE   SIZE    RES SWAP STATE    C   TIME    WCPU COMMAND
21012     0 root         21    0    17M  4516K   0B CPU3     3   0:00 100.00% top
52722     0 dch          20    0    28M    17M   0B select   4   0:02  44.43% mosh-server
   12     0 root        -64    -     0B   736K   0B WAIT     4   4:24  21.61% intr{its0,46: dpaa2_io3}
   12     0 root        -64    -     0B   736K   0B WAIT     7   7:47  21.43% intr{its0,43: dpaa2_io0}
85664     0 dch          20    0    14M  5016K   0B select   1   0:01  13.89% tmux
   12     0 root        -64    -     0B   736K   0B WAIT     5   4:12  13.20% intr{its0,45: dpaa2_io2}
   12     0 root        -64    -     0B   736K   0B WAIT     0   2:37  10.00% intr{its0,50: dpaa2_io7}
   12     0 root        -64    -     0B   736K   0B WAIT     2   3:33   9.24% intr{its0,48: dpaa2_io5}
   12     0 root        -64    -     0B   736K   0B WAIT     3   4:37   8.09% intr{its0,47: dpaa2_io4}
   12     0 root        -64    -     0B   736K   0B WAIT     6   3:42   3.56% intr{its0,44: dpaa2_io1}
    0     0 root        -64    -     0B  2400K   0B -        1   0:14   2.45% kernel{dpaa2_ni1_tqbp}
59646     1    317       20    0   363M   128M   0B kqread   6   2:27   1.19% node{node}
    0     0 root        -64    -     0B  2400K   0B -        2   0:17   0.38% kernel{dpaa2_ni2_tqbp}
20183     0 root         20    0    33M    14M   0B select   6   2:39   0.00% zerotier-one{zerotier-one}
   12     0 root        -64    -     0B   736K   0B WAIT     1   2:38   0.00% intr{its0,49: dpaa2_io6}
44098     0 root         20    0  1674M   537M   0B kqread   2   2:05   0.00% kresd
    6     0 root         -8    -     0B  1360K   0B tx->tx   1   1:54   0.00% zfskern{txg_thread_enter}
    2     0 root        -60    -     0B   128K   0B WAIT     1   1:44   0.00% clock{clock (0)}
19910     0 root        -16    -     0B    16K   0B pftm     1   1:38   0.00% pf purge

dpaa2_ni_rx panic: dpaa2_ni_rx: unexpected frame buffer fd_addr != buf_paddr

Commit: 173aa2a

I ran into this twice when running my stress test for long periods of time (>1 hour).

panic: dpaa2_ni_rx: unexpected frame buffer: fd_addr(0x305800008c900000) != buf_paddr(0x3058000088ccf000)
cpuid = 5
time = 1652662301
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
kdb_backtrace() at kdb_backtrace+0x38
vpanic() at vpanic+0x17c
panic() at panic+0x44
dpaa2_ni_rx() at dpaa2_ni_rx+0x26c
dpaa2_ni_poll_task() at dpaa2_ni_poll_task+0x1b0
taskqueue_run_locked() at taskqueue_run_locked+0xac
taskqueue_thread_loop() at taskqueue_thread_loop+0xc8
fork_exit() at fork_exit+0x74
fork_trampoline() at fork_trampoline+0x14
KDB: enter: panic
[ thread pid 0 tid 100118 ]
Stopped at      kdb_enter+0x40: undefined       f902027f

First crash trace:

(kgdb) frame 3
#3  0xffff0000007d3394 in dpaa2_ni_rx (chan=0xffff0000fd6f8000, fq=<optimized out>, fd=0xffff0000fda42020) at /usr/src/freebsd-src/sys/dev/dpaa2/dpaa2_ni.c:2630
2630            KASSERT(paddr == buf->paddr, ("%s: unexpected frame buffer: "
(kgdb) info locals
released = {0, 0, 0, 8589934592, 18446462598741807642, 18446462602928396336, 18446462598741044400}
ifp = <optimized out>
sc = <optimized out>
paddr = 3483534314129326080
released_n = 0
buf = <optimized out>
buf_chan = 0xec36f06f7149058a
buf_idx = <optimized out>
m = <optimized out>
buf_len = <optimized out>
buf_data = <optimized out>
error = <optimized out>
bp_dev = <optimized out>
bpsc = <optimized out>
chan_idx = <optimized out>
(kgdb) frame 4
#4  0xffff0000007d2da8 in dpaa2_ni_consume_frames (chan=0xffff0000fd6f8000, src=<optimized out>, consumed=<optimized out>) at /usr/src/freebsd-src/sys/dev/dpaa2/dpaa2_ni.c:2568
2568                                    fq->consume(chan, fq, fd);
(kgdb) info locals
retries = <optimized out>
fq = 0x80
rc = 36
dq = 0xffff0000fda42000
fd = 0xffff0000008c5b8a
frames = <optimized out>
(kgdb) print *dq
$4 = {{common = {verb = 96 '`', _reserved = "\223\000\000\000\000\000̖", '\000' <repeats 16 times>, "\321s\375\000\000\377\377\000P\347\070\202\000\024\000\352\005\000\000\000\000\300\000\000\200\000 \000\000\200\a\000\000\000\000\000\000\000"}, fdr = {desc = {
        verb = 96 '`', stat = 147 '\223', seqnum = 0, oprid = 0, _reserved = 0 '\000', tok = 204 '\314', fqid = 150, _reserved1 = 0, fq_byte_cnt = 0, fq_frm_cnt = 0, fqd_ctx = 18446462602985066752}, fd = {addr = 5630058834644992, data_length = 1514, bpid_ivp_bmt = 0,
        offset_fmt_sl = 192, frame_ctx = 536903680, ctrl = 125829120, flow_ctx = 0}}, scn = {verb = 96 '`', stat = 147 '\223', state = 0 '\000', _reserved = 0 '\000', rid_tok = 3422552064, ctx = 150}}}
(kgdb) print *fd
$5 = {addr = 7165916604720706863, data_length = 1701996079, bpid_ivp_bmt = 25189, offset_fmt_sl = 25715, frame_ctx = 1668444973, ctrl = 1937339183, flow_ctx = 7307986971750918959}
(kgdb) print *fq
Cannot access memory at address 0x80
(kgdb) print fd
$6 = (struct dpaa2_fd *) 0
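
kgdb prints `paddr` in decimal while the panic message prints addresses in hex, so the two are awkward to compare by eye. A minimal sketch of the conversion, using only values copied from the output above (the helper name `as_hex` is made up for illustration):

```python
# Values copied from the panic message and the kgdb session above.
panic_fd_addr = 0x305800008C900000    # fd_addr(...) in the panic string
panic_buf_paddr = 0x3058000088CCF000  # buf_paddr(...) in the panic string
kgdb_paddr = 3483534314129326080      # "paddr" from "info locals" in frame 3


def as_hex(value: int) -> str:
    """Format a 64-bit value the way the panic message prints it."""
    return f"0x{value:016x}"


print(as_hex(kgdb_paddr))                        # 0x305800008c900000
print(kgdb_paddr == panic_fd_addr)               # True
print(as_hex(panic_fd_addr ^ panic_buf_paddr))   # bits that differ between the two
```

So the `paddr` local in frame 3 appears to be the same value the panic reported as `fd_addr`, which suggests this kgdb session belongs to the panic quoted above.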

Second time:

#4  0xffff0000007d2da8 in dpaa2_ni_consume_frames (chan=0xffff0000fc616000, src=<optimized out>, consumed=<optimized out>) at /usr/src/freebsd-src/sys/dev/dpaa2/dpaa2_ni.c:2568
2568                                    fq->consume(chan, fq, fd);
(kgdb) info locals
retries = <optimized out>
fq = 0x80
rc = 36
dq = 0xffff0000fcc58000
fd = 0xffff0000008c5b8a
frames = <optimized out>
(kgdb) print *dq
$1 = {{common = {verb = 96 '`', _reserved = "\022\000\000\000\000\000̵", '\000' <repeats 16 times>, "\215a\374\000\000\377\377\000\000=\214\000\000\270qB\000\000\000\000\000\300@\000\240\000 \000\000\001\000\000\000\000\000\000\000\000"}, fdr = {desc = {verb = 96 '`',
        stat = 18 '\022', seqnum = 0, oprid = 0, _reserved = 0 '\000', tok = 204 '\314', fqid = 181, _reserved1 = 0, fq_byte_cnt = 0, fq_frm_cnt = 0, fqd_ctx = 18446462602967092480}, fd = {addr = 8194299524353425408, data_length = 66, bpid_ivp_bmt = 0,
        offset_fmt_sl = 16576, frame_ctx = 536911872, ctrl = 65536, flow_ctx = 0}}, scn = {verb = 96 '`', stat = 18 '\022', state = 0 '\000', _reserved = 0 '\000', rid_tok = 3422552064, ctx = 181}}}
(kgdb) print *fd
$2 = {addr = 7165916604720706863, data_length = 1701996079, bpid_ivp_bmt = 25189, offset_fmt_sl = 25715, frame_ctx = 1668444973, ctrl = 1937339183, flow_ctx = 7307986971750918959}
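
Both dumps show the same `fd` pointer in frame 4 and the same implausible field values, next to `fq = 0x80`. One way to sanity-check such a struct is to re-pack the printed integers into bytes and see whether they look like text. The sketch below assumes a little-endian layout with field widths of 64, 32, 16, 16, 32, 32 and 64 bits, in the order kgdb prints them; that layout is an assumption, not taken from the driver source.

```python
import struct

# Field values as printed by "print *fd" in frame 4 (identical in both dumps).
fd_fields = (
    7165916604720706863,  # addr
    1701996079,           # data_length
    25189,                # bpid_ivp_bmt
    25715,                # offset_fmt_sl
    1668444973,           # frame_ctx
    1937339183,           # ctrl
    7307986971750918959,  # flow_ctx
)

# ASSUMPTION: little-endian, packed as u64, u32, u16, u16, u32, u32, u64.
raw = struct.pack("<QIHHIIQ", *fd_fields)
print(raw.decode("ascii", errors="replace"))  # /usr/src/freebsd-src/sys/kern/ke
```

If the assumed layout is right, the bytes spell the start of a source path, so kgdb's view of `fd` (and probably `fq`) in this frame may be a stale or optimized-out value pointing at string data rather than the real frame descriptor.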

Expensive callout(9) function: dpaa2_ni_media_tick

[7.797840] Expensive callout(9) function: 0xffff000000c486a0(0xffffa00000be4000) 0.005835388 s

grep ffff000000c486a0 kernel.full.nm
ffff000000c486a0 t dpaa2_ni_media_tick
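
The same lookup can be scripted when several addresses need resolving. This is a rough sketch that parses `nm`-style lines (`<hex address> <type> <name>`) and returns the closest symbol at or below an address; the function names are made up for illustration and the only sample line is the one from the grep above.

```python
import bisect


def load_symbols(nm_text: str):
    """Parse nm output lines ("<hex addr> <type> <name>") into a sorted list."""
    symbols = []
    for line in nm_text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, address: int) -> str:
    """Return "name+0xoffset" for the closest symbol at or below address."""
    addresses = [addr for addr, _ in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return hex(address)  # address is below every known symbol
    base, name = symbols[index]
    return f"{name}+{address - base:#x}"


nm_text = "ffff000000c486a0 t dpaa2_ni_media_tick\n"
symbols = load_symbols(nm_text)
print(resolve(symbols, 0xFFFF000000C486A0))  # dpaa2_ni_media_tick+0x0
```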

I should have a look. Just opening here so we can track it.

Panic in dpaa2_ni_poll_task

Commit: 0e7b9be

This happened to me just now. Same environment as #3.

I'm not sure whether the textdump is helpful. I will try to set up my install to properly generate a minidump.


Fatal data abort:
  x0: ffff00011cd4c000
  x1:                0
  x2: ffff00011d094020
  x3:              152
  x4:                0
  x5:                d
  x6: 9c58ea23fe5fba08
  x7:    8ebb800fc9c58
  x8: ffff00011d094000
  x9: ffff00011d094000
 x10:                3
 x11:                3
 x12:                0
 x13:                0
 x14:                0
 x15:                8
 x16: ffff00016862ad48 (__stop_set_modmetadata_set + 448)
 x17: ffff00000059264c (if_inc_counter + 0)
 x18: ffff00010e3b1840
 x19: ffff00011cd4c000
 x20: ffff00011c9f1000
 x21: ffffa00002d4ec00
 x22: ffff00000091a4b3 (console_pausestr + 366c)
 x23: ffff0000008c3829 (do_execve.fexecv_proc_title + 26782)
 x24: ffff000000bc1000 (sccs + 8)
 x25:                1
 x26:               25
 x27: ffff000000e71000 (epoch_array + 1e00)
 x28: ffff000000e1c0e0 (thread0_st + 0)
 x29: ffff00010e3b1840
  sp: ffff00010e3b1840
  lr: ffff0000007e83d8 (dpaa2_ni_poll_task + c0)
 elr: ffff0000007e84b4 (dpaa2_ni_poll_task + 19c)
spsr:         60000045
 far:                0
 esr:         96000004
panic: vm_fault failed: ffff0000007e84b4 error 1
cpuid = 3
time = 1651998794
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x174
panic() at panic+0x44
data_abort() at data_abort+0x2f0
handle_el1h_sync() at handle_el1h_sync+0x10
--- exception, esr 0x96000004
dpaa2_ni_poll_task() at dpaa2_ni_poll_task+0x19c
taskqueue_run_locked() at taskqueue_run_locked+0x178
taskqueue_thread_loop() at taskqueue_thread_loop+0xc8
fork_exit() at fork_exit+0x74
fork_trampoline() at fork_trampoline+0x14
KDB: enter: panic
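
The `esr` value in the register dump can be split into its fields to see what kind of abort this was. The sketch below follows the general AArch64 ESR_EL1 layout (EC in bits 31:26, IL in bit 25, and for data aborts WnR in bit 6 and DFSC in bits 5:0); the function name is made up, and the interpretation in the comments is a reading of the dump, not a confirmed diagnosis.

```python
def decode_esr(esr: int) -> dict:
    """Split an AArch64 ESR_EL1 value into the fields relevant to data aborts."""
    return {
        "ec": (esr >> 26) & 0x3F,     # exception class
        "il": (esr >> 25) & 0x1,      # instruction length bit
        "wnr": bool((esr >> 6) & 1),  # True = write access, False = read
        "dfsc": esr & 0x3F,           # data fault status code
    }


fields = decode_esr(0x96000004)  # esr from the dump above
print(hex(fields["ec"]))         # 0x25: data abort taken from the same exception level
print(fields["wnr"])             # False: the faulting access was a read
print(hex(fields["dfsc"]))       # 0x4: translation fault, level 0
```

Together with `far: 0`, this looks like a read through a NULL pointer inside `dpaa2_ni_poll_task`. The value 0x96000044 that appears in another report in this thread decodes the same way except that WnR is set, i.e. a write.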

kernel panics

Hello,

Recently, my ten64 started crashing multiple times per day.

dpni9 and dpni8 are 10Gb NICs; in my case, dpni9 faces the internet while dpni8 serves my local network.
In addition, I use VLANs, mostly for VMware virtual machines.

Communication between internet and my local network works pretty fine, but

I tried different scenarios:

  1. dpni8 for my local network + VLANs
    As soon as a VM (so in a VLAN) begins to communicate with any other network, the ten64 crashes:
 Fatal data abort:
  x0: ffffa0001f96d800
  x1:                0
  x2:                2
  x3:       80893c5000
  x4: ffff0000008d26ac (generic_bs_w_4 + 0)
  x5: ffff0000f85cb820 (_DYNAMIC + f6bba868)
  x6:                0
  x7:                0
  x8:             40c0
  x9: ffff000161bc50c0 (__stop_set_sysinit_set + 1180d20)
 x10:              5ea
 x11:              5ea
 x12:                1
 x13:             2af8
 x14:               12
 x15:             2af8
 x16:             28b2
 x17:             28b1
 x18: ffff0000f85cb6a0 (_DYNAMIC + f6bba6e8)
 x19: ffff00011474b000
 x20: ffff000112429000
 x21: ffff00011474b100
 x22: ffff000114e07020
 x23: ffffa0001f96d800
 x24:                0
 x25: ffff000112459520
 x26:                0
 x27: ffff000000d9c618 (Giant + 18)
 x28: ffffa000058d9a80
 x29: ffff0000f85cb6e0 (_DYNAMIC + f6bba728)
  sp: ffff0000f85cb6a0
  lr: ffff00000092be84 (dpaa2_ni_rx + f0)
 elr: ffff00000092beb0 (dpaa2_ni_rx + 11c)
spsr:               45
 far:               10
 esr:         96000044
panic: vm_fault failed: ffff00000092beb0 error 1
cpuid = 0
time = 1686765529
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x13c
panic() at panic+0x44
data_abort() at data_abort+0x308
handle_el1h_sync() at handle_el1h_sync+0x14
--- exception, esr 0x96000044
dpaa2_ni_rx() at dpaa2_ni_rx+0x11c
dpaa2_ni_poll() at dpaa2_ni_poll+0x84
dpaa2_io_intr() at dpaa2_io_intr+0x16c
ithread_loop() at ithread_loop+0x3fc
fork_exit() at fork_exit+0x88
fork_trampoline() at fork_trampoline+0x14
KDB: enter: panic
[ thread pid 12 tid 100126 ]
Stopped at      kdb_enter+0x44: str     xzr, [x19, #1152]
  1. dpni8 for the local network and dpni5 for VLANs
    This configuration is less unstable, but it eventually crashes anyway:
panic: dpaa2_ni_rx: unexpected physical address: fd(0xa2ec8000) != buf(0xa3564000)
cpuid = 7
time = 1686903243
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x13c
panic() at panic+0x44
dpaa2_ni_rx() at dpaa2_ni_rx+0x2a0
dpaa2_ni_poll() at dpaa2_ni_poll+0x84
dpaa2_io_intr() at dpaa2_io_intr+0x16c
ithread_loop() at ithread_loop+0x3fc
fork_exit() at fork_exit+0x88
fork_trampoline() at fork_trampoline+0x14
KDB: enter: panic
[ thread pid 12 tid 100119 ]
Stopped at      kdb_enter+0x44: str     xzr, [x19, #1152]
  1. using 13.2
    It quickly crashes, even without activity
Fatal data abort:
  x0:                0
  x1:                0
  x2:                0
  x3:               40
  x4:               3f
  x5: ffff0000e6cf7000
  x6:  a000cfea4bfa322
  x7: 20450008c22724fa
  x8: ffff000000f34000
  x9: ffffa00000000000
 x10: 7784f7f643fcc836
 x11: 3fc367d3c6c5532f
 x12: 4eaef46f559c1721
 x13: d038e45d25e875d2
 x14: 927044e716003300
 x15: 188034c38f645e30
 x16:  1010000f61e2170
 x17: e73c84cf86bd0a08
 x18: ffff00015d2fded0
 x19: ffffa00002270100
 x20: ffffa0002114dc00
 x21: ffffa0001942f400
 x22: ffffa00040a6a000
 x23:                4
 x24: ffff0001137e8000
 x25: ffff000113899100
 x26: ffff00011389a340
 x27: ffff000113899120
 x28:               34
 x29: ffff00015d2fded0
  sp: ffff00015d2fded0
  lr: ffff0000007cf904
 elr: ffff0000007eaeec
spsr:         80000045
 far:               30
 esr:         96000004
panic: vm_fault failed: ffff0000007eaeec
cpuid = 2
time = 1686762562
KDB: stack backtrace:
#0 0xffff0000004fd02c at kdb_backtrace+0x60
#1 0xffff0000004a8328 at vpanic+0x13c
#2 0xffff0000004a81e8 at panic+0x44
#3 0xffff0000007f42e0 at data_abort+0x200
#4 0xffff0000007d3010 at handle_el1h_sync+0x10
#5 0xffff0000007cf900 at bounce_bus_dmamap_sync+0x74
#6 0xffff0000007cf900 at bounce_bus_dmamap_sync+0x74
#7 0xffff00000081b0dc at dpaa2_ni_transmit+0x3c4
#8 0xffff0000005df3c4 at ether_output_frame+0xd4
#9 0xffff0000005df200 at ether_output+0x664
#10 0xffff00000063b258 at ip_output+0x1320
#11 0xffff000000655ab4 at tcp_output+0x1e8c
#12 0xffff00000066b788 at tcp_usr_send+0x1f4
#13 0xffff000000557c4c at sosend_generic+0x598
#14 0xffff000000558364 at sosend+0x3c
#15 0xffff00000052c978 at soo_write+0x44
#16 0xffff000000521fb0 at dofilewrite+0x7c
#17 0xffff000000521a98 at sys_write+0xb8

Tell me if you need more information or tests from me. My build machine is fast so it does not bother me to compile multiple times.

[ten64 branch] Dataflow stops with multiple port traffic

Commit: 5f6b8b3

In my test suite, I run iperf3 through the FreeBSD host functioning as a router

For example:
iperf3 server <-> dpniX (FreeBSD) dpniX+1 <-> iperf3 client

The test system is another Ten64 running Linux which runs each iperf3 instance in a container with one of the ethX ports transferred into it.

So dpni0 on FreeBSD -> eth0 on test system, dpni1<->eth1, dpni2<->eth2 etc.

(I'll publish the scripts another time, they need a bit of cleanup)

cat /etc/rc.conf
hostname="freebsd-ten64"
ifconfig_dpni0="192.168.13.1 netmask 255.255.255.0"
ifconfig_dpni1="192.168.14.1 netmask 255.255.255.0"
ifconfig_dpni2="192.168.15.1 netmask 255.255.255.0"
ifconfig_dpni3="192.168.16.1 netmask 255.255.255.0"
ifconfig_dpni6="DHCP inet6 accept_rtadv"
growfs_enable="YES"
dhcpd_enable="YES"                          # dhcpd enabled?
dhcpd_flags="-q"                            # command option(s)
dhcpd_conf="/usr/local/etc/dhcpd.conf"      # configuration file
dhcpd_ifaces="dpni1 dpni3"                  # ethernet interface(s)
dhcpd_withumask="022"                       # file creation mask
gateway_enable="YES"
sshd_enable="YES"

dpni6 is the interface to my LAN for management

Server 1 is attached to dpni0 on 192.168.13.2
Client 1 is on dpni1, gets an IP via DHCP and initiates an iperf3 -R -c 192.168.13.2
Server 2 is on 192.168.15.2, Client 2 on 192.168.16.X, and so on.
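
To double-check which FreeBSD interface a given test address sits behind, the static addresses from the rc.conf above can be fed to Python's `ipaddress` module. This is only a sketch; the dictionary and the `interface_for` helper are made up for illustration, and dpni6 is left out because it uses DHCP.

```python
import ipaddress

# Static addresses from the rc.conf above (all /24).
interfaces = {
    "dpni0": ipaddress.ip_interface("192.168.13.1/24"),
    "dpni1": ipaddress.ip_interface("192.168.14.1/24"),
    "dpni2": ipaddress.ip_interface("192.168.15.1/24"),
    "dpni3": ipaddress.ip_interface("192.168.16.1/24"),
}


def interface_for(address: str):
    """Return the name of the interface whose subnet contains address, or None."""
    ip = ipaddress.ip_address(address)
    for name, iface in interfaces.items():
        if ip in iface.network:
            return name
    return None


print(interface_for("192.168.13.2"))   # dpni0 (Server 1)
print(interface_for("192.168.15.2"))   # dpni2 (Server 2)
print(interface_for("192.168.14.10"))  # dpni1 (the DHCP lease in the client log)
```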

For this initial test, I will run just one flow.

On this branch, the dataflow completely stops almost immediately:

udhcpc: started, v1.34.1
udhcpc: broadcasting discover                                                   
udhcpc: broadcasting select for 192.168.14.10, server 192.168.14.1
udhcpc: lease of 192.168.14.10 obtained from 192.168.14.1, lease time 600
Connecting to host 192.168.13.2, port 5201
Reverse mode, remote host 192.168.13.2 is sending
[  5] local 192.168.14.10 port 53100 connected to 192.168.13.2 port 5201
[ ID] Interval           Transfer     Bitrate                                   
[  5]   0.00-1.00   sec  14.1 KBytes   116 Kbits/sec
[  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec
[  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec
[  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec
[  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec

In this case, dpni1 won't receive any traffic to the iperf3 server (192.168.13.2), but will receive other frames:


root@freebsd-ten64:/dev # tcpdump -i dpni1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on dpni1, link-type EN10MB (Ethernet), capture size 262144 bytes
07:03:08.616665 ARP, Request who-has 192.168.14.1 tell 192.168.14.10, length 46
07:03:32.278905 IP 192.168.14.10 > 192.168.13.1: ICMP echo request, id 9, seq 0, length 64
# 192.168.13.2 missing
07:03:34.073692 IP 192.168.14.10 > 192.168.13.3: ICMP echo request, id 11, seq 0, length 64
07:03:34.858869 IP 192.168.14.10 > 192.168.13.4: ICMP echo request, id 12, seq 0, length 64
07:03:35.533436 IP 192.168.14.10 > 192.168.13.5: ICMP echo request, id 13, seq 0, length 64
07:03:36.293349 IP 192.168.14.10 > 192.168.13.6: ICMP echo request, id 14, seq 0, length 64
07:03:37.348211 IP 192.168.14.10 > 192.168.13.7: ICMP echo request, id 15, seq 0, length 64
07:03:38.298136 IP 192.168.14.10 > 192.168.13.8: ICMP echo request, id 16, seq 0, length 64
07:03:39.223962 IP 192.168.14.10 > 192.168.13.9: ICMP echo request, id 17, seq 0, length 64
# 192.168.13.10 MISSING
07:03:41.163779 IP 192.168.14.10 > 192.168.13.11: ICMP echo request, id 19, seq 0, length 64
07:03:42.098482 IP 192.168.14.10 > 192.168.13.12: ICMP echo request, id 20, seq 0, length 64
07:03:42.853497 IP 192.168.14.10 > 192.168.13.13: ICMP echo request, id 21, seq 0, length 64
# Other IPs not being received: 192.168.13.17, 26, 29, so looks like one of the queues is not processing
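
Spotting the gaps by eye gets tedious on longer captures. A small sketch that lists the destination host numbers missing from a sweep, using the echo-request lines from the capture above (the regular expression and the `missing_hosts` helper are illustrative, and it assumes the sweep targets consecutive addresses in one /24):

```python
import re

CAPTURE = """\
07:03:32.278905 IP 192.168.14.10 > 192.168.13.1: ICMP echo request, id 9, seq 0, length 64
07:03:34.073692 IP 192.168.14.10 > 192.168.13.3: ICMP echo request, id 11, seq 0, length 64
07:03:34.858869 IP 192.168.14.10 > 192.168.13.4: ICMP echo request, id 12, seq 0, length 64
07:03:35.533436 IP 192.168.14.10 > 192.168.13.5: ICMP echo request, id 13, seq 0, length 64
07:03:36.293349 IP 192.168.14.10 > 192.168.13.6: ICMP echo request, id 14, seq 0, length 64
07:03:37.348211 IP 192.168.14.10 > 192.168.13.7: ICMP echo request, id 15, seq 0, length 64
07:03:38.298136 IP 192.168.14.10 > 192.168.13.8: ICMP echo request, id 16, seq 0, length 64
07:03:39.223962 IP 192.168.14.10 > 192.168.13.9: ICMP echo request, id 17, seq 0, length 64
07:03:41.163779 IP 192.168.14.10 > 192.168.13.11: ICMP echo request, id 19, seq 0, length 64
07:03:42.098482 IP 192.168.14.10 > 192.168.13.12: ICMP echo request, id 20, seq 0, length 64
07:03:42.853497 IP 192.168.14.10 > 192.168.13.13: ICMP echo request, id 21, seq 0, length 64
"""

# Captures the last octet of the destination address of each echo request.
ECHO_RE = re.compile(r"> \d+\.\d+\.\d+\.(\d+): ICMP echo request")


def missing_hosts(capture: str):
    """Return host numbers absent between the lowest and highest one seen."""
    seen = {int(match.group(1)) for match in ECHO_RE.finditer(capture)}
    if not seen:
        return []
    return [host for host in range(min(seen), max(seen) + 1) if host not in seen]


print(missing_hosts(CAPTURE))  # [2, 10]
```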

vmstat:

vmstat -i | grep dpaa2
its0,140: dpaa2_io0                                      17          0
its0,141: dpaa2_io1                                     798          2
its0,142: dpaa2_io2                                      13          0
its0,143: dpaa2_io3                                      53          0
its0,144: dpaa2_io4                                       4          0
its0,145: dpaa2_io5                                       4          0
its0,146: dpaa2_io6                                      21          0
its0,147: dpaa2_io7                                      38          0
its0,148: dpaa2_ni0                                       1          0
its0,149: dpaa2_ni1                                       1          0
its0,150: dpaa2_ni2                                       1          0
its0,151: dpaa2_ni3                                       1          0
its0,154: dpaa2_ni6                                       1          0

I ran vmstat a few more times:

its0,140: dpaa2_io0                                      17          0
its0,141: dpaa2_io1                                    1429          1
its0,142: dpaa2_io2                                      25          0
its0,143: dpaa2_io3                                     104          0
its0,144: dpaa2_io4                                      11          0
its0,145: dpaa2_io5                                      16          0
its0,146: dpaa2_io6                                      35          0
its0,147: dpaa2_io7                                      56          0
its0,148: dpaa2_ni0                                       1          0
its0,149: dpaa2_ni1                                       1          0
its0,150: dpaa2_ni2                                       1          0
its0,151: dpaa2_ni3                                       1          0
its0,154: dpaa2_ni6                                       1          0

its0,140: dpaa2_io0 counter has not changed, is it stuck?
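
Comparing two `vmstat -i` snapshots can be automated so that stalled interrupt sources stand out. The following is a sketch over the dpaa2_io lines of the two snapshots above; the helper names are made up, and an unchanged counter only hints at a stuck queue, it does not prove one.

```python
import re

FIRST = """\
its0,140: dpaa2_io0                                      17          0
its0,141: dpaa2_io1                                     798          2
its0,142: dpaa2_io2                                      13          0
its0,143: dpaa2_io3                                      53          0
its0,144: dpaa2_io4                                       4          0
its0,145: dpaa2_io5                                       4          0
its0,146: dpaa2_io6                                      21          0
its0,147: dpaa2_io7                                      38          0
"""

SECOND = """\
its0,140: dpaa2_io0                                      17          0
its0,141: dpaa2_io1                                    1429          1
its0,142: dpaa2_io2                                      25          0
its0,143: dpaa2_io3                                     104          0
its0,144: dpaa2_io4                                      11          0
its0,145: dpaa2_io5                                      16          0
its0,146: dpaa2_io6                                      35          0
its0,147: dpaa2_io7                                      56          0
"""

# "<irq>: <name> <total> <rate>"
LINE_RE = re.compile(r"^\S+:\s+(\S+)\s+(\d+)\s+(\d+)\s*$")


def parse_totals(text: str) -> dict:
    """Map interrupt name to its total count."""
    totals = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            totals[match.group(1)] = int(match.group(2))
    return totals


def deltas(before: str, after: str) -> dict:
    """Per-interrupt increase between two snapshots."""
    old, new = parse_totals(before), parse_totals(after)
    return {name: new[name] - old[name] for name in old if name in new}


changes = deltas(FIRST, SECOND)
stalled = [name for name, delta in changes.items() if delta == 0]
print(stalled)  # ['dpaa2_io0']
```

The dpaa2_niX lines also stay at 1 in both snapshots; they are left out here because the question above is about the dpaa2_io counters.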

dpaa2 niX counters:

sysctl dev.dpaa2_ni.0
dev.dpaa2_ni.0.stats.in_all_frames: 66
dev.dpaa2_ni.0.stats.in_all_bytes: 62368
dev.dpaa2_ni.0.stats.in_multi_frames: 0
dev.dpaa2_ni.0.stats.eg_all_frames: 36
dev.dpaa2_ni.0.stats.eg_all_bytes: 2471
dev.dpaa2_ni.0.stats.eg_multi_frames: 0
dev.dpaa2_ni.0.stats.in_filtered_frames: 0
dev.dpaa2_ni.0.stats.in_discarded_frames: 0
dev.dpaa2_ni.0.stats.in_nobuf_discards: 0
dev.dpaa2_ni.0.stats.buf_free: 0
dev.dpaa2_ni.0.stats.buf_num: 2800
dev.dpaa2_ni.0.%parent: dpaa2_rc0
dev.dpaa2_ni.0.%pnpinfo:
dev.dpaa2_ni.0.%location:
dev.dpaa2_ni.0.%driver: dpaa2_ni
dev.dpaa2_ni.0.%desc: DPAA2 Network Interface
root@freebsd-ten64:/dev # sysctl dev.dpaa2_ni.1
dev.dpaa2_ni.1.stats.in_all_frames: 584
dev.dpaa2_ni.1.stats.in_all_bytes: 56208
dev.dpaa2_ni.1.stats.in_multi_frames: 0
dev.dpaa2_ni.1.stats.eg_all_frames: 195
dev.dpaa2_ni.1.stats.eg_all_bytes: 32414
dev.dpaa2_ni.1.stats.eg_multi_frames: 0
dev.dpaa2_ni.1.stats.in_filtered_frames: 0
dev.dpaa2_ni.1.stats.in_discarded_frames: 0
dev.dpaa2_ni.1.stats.in_nobuf_discards: 0
dev.dpaa2_ni.1.stats.buf_free: 0
dev.dpaa2_ni.1.stats.buf_num: 2800
dev.dpaa2_ni.1.%parent: dpaa2_rc0
dev.dpaa2_ni.1.%pnpinfo:
dev.dpaa2_ni.1.%location:
dev.dpaa2_ni.1.%driver: dpaa2_ni
dev.dpaa2_ni.1.%desc: DPAA2 Network Interface
root@freebsd-ten64:/dev # sysctl dev.dpaa2_ni.6
dev.dpaa2_ni.6.stats.in_all_frames: 1103
dev.dpaa2_ni.6.stats.in_all_bytes: 95670
dev.dpaa2_ni.6.stats.in_multi_frames: 667
dev.dpaa2_ni.6.stats.eg_all_frames: 10
dev.dpaa2_ni.6.stats.eg_all_bytes: 978
dev.dpaa2_ni.6.stats.eg_multi_frames: 0
dev.dpaa2_ni.6.stats.in_filtered_frames: 2
dev.dpaa2_ni.6.stats.in_discarded_frames: 0
dev.dpaa2_ni.6.stats.in_nobuf_discards: 0
dev.dpaa2_ni.6.stats.buf_free: 0
dev.dpaa2_ni.6.stats.buf_num: 2800
dev.dpaa2_ni.6.%parent: dpaa2_rc0
dev.dpaa2_ni.6.%pnpinfo:
dev.dpaa2_ni.6.%location:
dev.dpaa2_ni.6.%driver: dpaa2_ni
dev.dpaa2_ni.6.%desc: DPAA2 Network Interface

Many dpaa2_niX: dpaa2_ni_transmit: drbr_enqueue() failed errors

Hardware: Ten64, MC 10.20 (default DPAA2 configuration)
Commit: 0e7b9be

I've had a Ten64 with FreeBSD serving my audiovisual equipment (TV, set-top box, Xbox) for a week now, and it has worked well.

I have noticed these messages appearing in syslog when multiple devices are doing traffic:

May  6 21:52:03 ten64-freebsd kernel: dpaa2_ni2: dpaa2_ni_poll_task: failed to pull frames: chan_id=54, error=16
May  7 03:55:43 ten64-freebsd kernel: dpaa2_ni2: dpaa2_ni_poll_task: failed to pull frames: chan_id=54, error=16
May  7 08:02:16 ten64-freebsd kernel: dpaa2_ni1: dpaa2_ni_transmit: drbr_enqueue() failed
May  7 08:02:16 ten64-freebsd syslogd: last message repeated 5 times
May  7 08:03:46 ten64-freebsd syslogd: last message repeated 8 times
May  7 08:18:06 ten64-freebsd kernel: dpaa2_ni1: dpaa2_ni_transmit: drbr_enqueue() failed
May  7 08:18:17 ten64-freebsd syslogd: last message repeated 23 times
May  8 03:42:54 ten64-freebsd syslogd: last message repeated 1 times
May  8 03:55:20 ten64-freebsd kernel: dpaa2_ni1: dpaa2_ni_transmit: drbr_enqueue() failed
May  8 03:55:20 ten64-freebsd syslogd: last message repeated 7 times
May  8 04:36:25 ten64-freebsd kernel: dpaa2_ni1: dpaa2_ni_transmit: drbr_enqueue() failed
May  8 04:36:25 ten64-freebsd syslogd: last message repeated 13 times

It doesn't happen consistently, but it has happened often enough on two separate occasions.
No major effects (e.g. dropped or degraded video streams) are visible on the connected devices.
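
Because syslogd collapses repeats into "last message repeated N times", the raw number of failures is larger than the number of kernel lines. A rough sketch that totals them for the excerpt above, assuming each repeat line refers to the most recent kernel message (the regular expressions and `count_messages` are illustrative):

```python
import re
from collections import Counter

SYSLOG = """\
May  6 21:52:03 ten64-freebsd kernel: dpaa2_ni2: dpaa2_ni_poll_task: failed to pull frames: chan_id=54, error=16
May  7 03:55:43 ten64-freebsd kernel: dpaa2_ni2: dpaa2_ni_poll_task: failed to pull frames: chan_id=54, error=16
May  7 08:02:16 ten64-freebsd kernel: dpaa2_ni1: dpaa2_ni_transmit: drbr_enqueue() failed
May  7 08:02:16 ten64-freebsd syslogd: last message repeated 5 times
May  7 08:03:46 ten64-freebsd syslogd: last message repeated 8 times
May  7 08:18:06 ten64-freebsd kernel: dpaa2_ni1: dpaa2_ni_transmit: drbr_enqueue() failed
May  7 08:18:17 ten64-freebsd syslogd: last message repeated 23 times
May  8 03:42:54 ten64-freebsd syslogd: last message repeated 1 times
May  8 03:55:20 ten64-freebsd kernel: dpaa2_ni1: dpaa2_ni_transmit: drbr_enqueue() failed
May  8 03:55:20 ten64-freebsd syslogd: last message repeated 7 times
May  8 04:36:25 ten64-freebsd kernel: dpaa2_ni1: dpaa2_ni_transmit: drbr_enqueue() failed
May  8 04:36:25 ten64-freebsd syslogd: last message repeated 13 times
"""

KERNEL_RE = re.compile(r"kernel: \S+: (\S+): ")  # captures the function name
REPEAT_RE = re.compile(r"last message repeated (\d+) times")


def count_messages(text: str) -> Counter:
    """Total occurrences per function name, including collapsed repeats."""
    totals = Counter()
    last_key = None
    for line in text.splitlines():
        kernel = KERNEL_RE.search(line)
        repeat = REPEAT_RE.search(line)
        if kernel:
            last_key = kernel.group(1)
            totals[last_key] += 1
        elif repeat and last_key is not None:
            totals[last_key] += int(repeat.group(1))
    return totals


totals = count_messages(SYSLOG)
print(totals["dpaa2_ni_transmit"])   # 61
print(totals["dpaa2_ni_poll_task"])  # 2
```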

The configuration is:

  • dpni0..3 (GE0..3) bridged as bridge0. Devices are connected on each of these interfaces.
ifconfig bridge0 create
ifconfig bridge0 inet 192.168.13.1/24
ifconfig bridge0 addm dpni0 addm dpni1 addm dpni2 addm dpni3
ifconfig bridge0 up
  • dpni6 (GE6) as 'WAN' interface
  • pf acting as NAT firewall between bridge0 and dpni6

I will try and construct a more repeatable testcase.

ether_nh_input: no mbuf packet header!

I've stumbled upon this panic during a network load test:

# uname -apKU
FreeBSD guardian 14.0-ALPHA2 FreeBSD 14.0-ALPHA2 aarch64 1400096 #1 main-n264908-58983e4b0253: Sun Aug 20 12:31:18 CEST 2023     [email protected]:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC arm64 aarch64 1400096 1400096
...
panic: ether_nh_input: no mbuf packet header!
cpuid = 4
time = 14861
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x19c
panic() at panic+0x44
ether_nh_input() at ether_nh_input+0x460
netisr_dispatch_src() at netisr_dispatch_src+0xe0
ether_input() at ether_input+0xa0
dpaa2_ni_rx() at dpaa2_ni_rx+0x1fc
dpaa2_ni_cleanup_task() at dpaa2_ni_cleanup_task+0x174
taskqueue_run_locked() at taskqueue_run_locked+0x17c
taskqueue_thread_loop() at taskqueue_thread_loop+0xc8
fork_exit() at fork_exit+0x74
fork_trampoline() at fork_trampoline+0x14
KDB: enter: panic
[ thread pid 0 tid 100134 ]
Stopped at      kdb_enter+0x44: str     xzr, [x19, #768]

I haven't ever seen it before.

ten64: No dataflow on boot until cable replugged

FYI: I am seeing this issue (as of a85d6c9) as well. It may have been around for longer, but it is hard to tell apart from the other dataflow issues.

There is no dataflow (in ten64's 'managed' mode) until I remove and reconnect the Ethernet cables.
If I have some time on the weekend, I will try to do some debugging.

          I'm not sure whether it's the same issue, but I can't do any traffic after booting until I replug the network cables. If I just unplug them and immediately plug in, traffic starts working.

Originally posted by @pkubaj in #18 (comment)
