cisco / exanic-software
ExaNIC drivers, utilities and development libraries
License: Other
Hello,
As it currently is, the repository returns 200 OK for any request, whether the path exists or not, and then serves a page displaying a 404 instead of returning the 404 status code. When mirroring the repository, this causes issues when searching for a treeinfo file or similar: the mirroring tool sees the 200 OK, attempts to pull the file, and then fails because it isn't present.
$ curl -s "http://exablaze.com/downloads/yum/redhat/el7/x86_64/repodata/treeinfo" -ILk | grep HTTP
HTTP/1.1 302 Found
HTTP/1.1 302 Found
HTTP/1.1 200 OK
$ curl -s "http://exablaze.com/downloads/yum/redhat/el7/x86_64/repodata/treeinfo" -Lk | pup 'title:contains("404") text{}'
Exablaze - Error 404 - Page Not Found
I'm unsure if this is the best place to reach out regarding the issue, let me know if this should be posted elsewhere! Thank you
Hello all,
What is the cause of the error in the subject, and how can it be remedied?
I am simply running
taskset 4444 exasock
Hardware is ExaNIC X10
exanic is 2.5.0-1.el7
CentOS 7.5.1804
Thank you!
Update:
I reproduced the issue with an application that has two accelerated sockets:
read and read/write.
I start multiple instances on the same multicast groups, with the following results:
The first 6 instances have no issues.
The 7th generates an error inside bind.
[pid 23731] socket(AF_INET, 2050, 0) = 15
[pid 23731] setsockopt(15, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
[pid 23731] setsockopt(15, SOL_SOCKET, SO_RCVBUF, [4194304], 4) = 0
[pid 23731] bind(15, {sa_family=AF_INET, sin_port=htons(), sin_addr=inet_addr("0.0.0.0")}, 16
exasock warning: setting of SO_RCVBUF on accelerated socket is not effective
exasock_exanic_ip_dev_init: exanic_acquire_tx_buffer failed for dev exanic0
exasock_exanic_ip_dev_init: exanic_acquire_tx_buffer failed for dev exanic0
exasock_exanic_ip_dev_init: exanic_acquire_tx_buffer failed for dev exanic0
exasock_exanic_ip_dev_init: exanic_acquire_tx_buffer failed for dev exanic0
) = 0
Ignore the SO_RCVBUF warning; it is a no-op.
Instances 8-34 run without errors.
If I kill the 7th instance and start it again, it generates the exact same failure.
Update 2:
Firmware ver 21 solved the problem of the 7th instance. With ver 21, all processes after the 7th produce the error.
Update 3:
Solution:
The following code change eliminates exanic_acquire_tx_buffer in the multicast reader.
Otherwise exasock acquires a TX buffer for multicast reads on every NIC on the box.
Suppose the following case:
The server wants to send a packet of 1490 bytes, which is split into 1460+30.
The client calls recv on the socket with the MSG_WAITALL flag and provides a length of 298.
recv will return with a length of 298
recv will return 268, even if the last 30 bytes will be received in the following packet
Is the case above possible? How should it be dealt with?
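For the MSG_WAITALL question above, a common defensive pattern is to loop until the requested byte count has arrived, because recv may legitimately return fewer bytes than asked for (for example when interrupted by a signal). The sketch below is plain POSIX code, not exasock-specific; recv_exact is a name chosen here for illustration, and whether exasock really returns short in the 268+30 case is not confirmed by the report.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: read exactly `len` bytes from a stream socket.
 * Returns the number of bytes read (less than `len` only if the peer
 * closed the connection), or -1 on error with errno set. */
ssize_t recv_exact(int fd, void *buf, size_t len)
{
    size_t got = 0;

    assert(buf != NULL);
    while (got < len) {
        /* MSG_WAITALL is only a request; a short return is still possible */
        ssize_t n = recv(fd, (char *)buf + got, len - got, MSG_WAITALL);
        if (n > 0) {
            got += (size_t)n;
            continue;
        }
        if (n == 0)
            break;              /* orderly shutdown by the peer */
        if (errno == EINTR)
            continue;           /* interrupted: retry with what is left */
        return -1;
    }
    return (ssize_t)got;
}
```

With this helper the caller gets 298 bytes whether the stack delivers them in one recv call or as 268 followed by 30.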
Hi,
After a TCP socket is closed, it goes to exasock_tcp_free, where synchronize_rcu is eventually called in the gc worker. Since that is basically a sleep(N_ms) and sockets are processed one by one in the gc workqueue, this puts a (rather low) hard limit on the number of sockets destroyed per second. If you create and close too many sockets per second, you will eventually run out of memory.
I have a patch which creates one more gc workqueue. It works for me, but I'm not sure about its quality. This problem may also be present in UDP sockets: exasock_udp_free uses synchronize_rcu, but I didn't test it.
diff --git a/modules/exasock/exasock-tcp.c b/modules/exasock/exasock-tcp.c
index 263b2e6..69bb947 100644
--- a/modules/exasock/exasock-tcp.c
+++ b/modules/exasock/exasock-tcp.c
@@ -203,6 +203,7 @@ struct exasock_tcp
/* fin handshake and garbage collection work */
struct delayed_work fin_work;
struct delayed_work gc_work;
+ struct delayed_work free_work;
/* list node for queueing on the tw death row */
struct list_head death_link;
@@ -261,6 +262,13 @@ struct exasock_tcp_req
struct sk_buff_head skb_queue;
};
+struct exasock_tcp_free_req
+{
+ struct exasock_tcp *tcp;
+ struct rcu_head rcu;
+};
+
+
static struct hlist_head * tcp_buckets;
static DEFINE_SPINLOCK( tcp_bucket_lock);
@@ -268,6 +276,7 @@ static struct sk_buff_head tcp_packets;
static struct workqueue_struct *tcp_workqueue;
/* for running deferred calls to exasock_tcp_gc_worker() */
static struct workqueue_struct *tcp_gc_workqueue;
+static struct workqueue_struct *tcp_free_workqueue;
static struct delayed_work tcp_rx_work;
static struct hlist_head * tcp_req_buckets;
@@ -385,7 +394,10 @@ static struct exasock_tcp_req *exasock_tcp_req_lookup(uint32_t local_addr,
* note: do not call it directly, only run on
* tcp_gc_workqueue */
static void exasock_tcp_gc_worker(struct work_struct *work);
+static void exasock_tcp_gc_callback(struct rcu_head *rp);
static void exasock_tcp_dead(struct kref *ref);
+/* exasock_tcp_gc implementation details */
+static void exasock_tcp_free_worker(struct work_struct *work);
/* this function performs some clean-up before
* deferring exasock_tcp_gc_worker to the cleanup workqueue */
static void exasock_tcp_free(struct exasock_tcp *tcp);
@@ -435,6 +447,8 @@ static inline void exasock_tcp_kill_stray(void)
rcu_read_unlock();
flush_workqueue(tcp_gc_workqueue);
destroy_workqueue(tcp_gc_workqueue);
+ flush_workqueue(tcp_free_workqueue);
+ destroy_workqueue(tcp_free_workqueue);
}
#if LINUX_VERSION_CODE >= KERNEL_VERSION(3, 13, 0)
@@ -1215,9 +1229,10 @@ struct exasock_tcp *exasock_tcp_alloc(struct socket *sock, int fd)
INIT_DELAYED_WORK(&tcp->work, exasock_tcp_conn_worker);
queue_delayed_work(tcp_workqueue, &tcp->work, TCP_TIMER_JIFFIES);
- INIT_DELAYED_WORK(&tcp->win_work, exasock_tcp_conn_win_worker);
- INIT_DELAYED_WORK(&tcp->fin_work, exasock_tcp_close_worker);
- INIT_DELAYED_WORK(&tcp->gc_work, exasock_tcp_gc_worker);
+ INIT_DELAYED_WORK(&tcp->win_work, exasock_tcp_conn_win_worker);
+ INIT_DELAYED_WORK(&tcp->fin_work, exasock_tcp_close_worker);
+ INIT_DELAYED_WORK(&tcp->gc_work, exasock_tcp_gc_worker);
+ INIT_DELAYED_WORK(&tcp->free_work, exasock_tcp_free_worker);
INIT_LIST_HEAD(&tcp->death_link);
hash = exasock_tcp_hash(tcp->local_addr, tcp->peer_addr,
@@ -1531,13 +1546,32 @@ static void exasock_tcp_gc_worker(struct work_struct *work)
{
struct delayed_work *dwork = container_of(work, struct delayed_work, work);
struct exasock_tcp *tcp = container_of(dwork, struct exasock_tcp, gc_work);
- int i;
+ struct exasock_tcp_free_req *req = kmalloc(sizeof(*req), GFP_KERNEL);
+ if (req == NULL) {
+ return;
+ }
spin_lock(&tcp_wait_death_lock);
list_del_rcu(&tcp->death_link);
spin_unlock(&tcp_wait_death_lock);
- synchronize_rcu();
+ req->tcp = tcp;
+ call_rcu(&req->rcu, exasock_tcp_gc_callback);
+ // synchronize_rcu();
+}
+
+static void exasock_tcp_gc_callback(struct rcu_head *rp)
+{
+ struct exasock_tcp_free_req *req = container_of(rp, struct exasock_tcp_free_req, rcu);
+ queue_delayed_work(tcp_free_workqueue, &req->tcp->free_work, 0);
+ kfree(req);
+}
+
+static void exasock_tcp_free_worker(struct work_struct *work)
+{
+ struct delayed_work *dwork = container_of(work, struct delayed_work, work);
+ struct exasock_tcp *tcp = container_of(dwork, struct exasock_tcp, free_work);
+ int i;
/* Wait for refcount to go to 0 */
kref_put(&tcp->refcount, exasock_tcp_dead);
@@ -3482,6 +3516,13 @@ int __init exasock_tcp_init(void)
goto gc_wq_null;
}
+ tcp_free_workqueue = create_workqueue("exasock_tcp_free");
+ if (tcp_free_workqueue == NULL)
+ {
+ err = -ENOMEM;
+ goto free_wq_null;
+ }
+
INIT_DELAYED_WORK(&tcp_rx_work, exasock_tcp_rx_worker);
INIT_DELAYED_WORK(&tcp_req_work, exasock_tcp_req_worker);
queue_delayed_work(tcp_workqueue, &tcp_req_work, TCP_TIMER_JIFFIES);
@@ -3503,6 +3544,8 @@ err_ate_client_register:
intercept_failed:
cancel_delayed_work_sync(&tcp_req_work);
flush_workqueue(tcp_workqueue);
+ destroy_workqueue(tcp_free_workqueue);
+free_wq_null:
destroy_workqueue(tcp_gc_workqueue);
gc_wq_null:
destroy_workqueue(tcp_workqueue);
Any chance to have Exanic support for RHEL9, which was released on 18th of May?
make[1]: Entering directory '/home/adi/work/git/repos/exanic-software/modules/exanic'
make -C /lib/modules/`uname -r`/build M=$PWD modules
make[2]: Entering directory '/usr/src/kernels/5.14.0-70.17.1.el9_0.x86_64'
CC [M] /home/adi/work/git/repos/exanic-software/modules/exanic/exanic-main.o
/home/adi/work/git/repos/exanic-software/modules/exanic/exanic-main.c:2091:1: error: expected ',' or ';' before 'MODULE_SUPPORTED_DEVICE'
2091 | MODULE_SUPPORTED_DEVICE(DRV_NAME);
| ^~~~~~~~~~~~~~~~~~~~~~~
make[3]: *** [scripts/Makefile.build:271: /home/adi/work/git/repos/exanic-software/modules/exanic/exanic-main.o] Error 1
make[2]: *** [Makefile:1862: /home/adi/work/git/repos/exanic-software/modules/exanic] Error 2
make[2]: Leaving directory '/usr/src/kernels/5.14.0-70.17.1.el9_0.x86_64'
make[1]: *** [Makefile:17: default] Error 2
make[1]: Leaving directory '/home/adi/work/git/repos/exanic-software/modules/exanic'
make: *** [Makefile:9: modules] Error 2
Taking a look at LinkedIn it looks like the primary contributors are no longer working for Exablaze/Cisco. There also have been no updates in over a year to the hardware devkit and there are still outstanding bugs dating to 2019.
Hoping someone gets notified about this and can respond. Certain issues can be fixed locally but official updates from Cisco would be great.
With exanic-software-2.7.3 on rhel9.0-5.14-70.13.1.0.3, both exanic-clock-sync and ptp4l show a 300 ns bias between the HW timestamp and the system clock. Does anybody have a clue about this?
Hey
I have been testing the trigger example provided with the ExaNIC X25 FDK. The trigger example, along with the exasock-tcp-responder-example software application provided by exanic-software, works fine at a 1 Gbps data rate.
I have built the FDK design using VARIANT=full_multirate, which supports 1/10G data rates. When I try to run the same firmware at 10G speed, the X25 is somehow not able to send out the data. The data can be seen correctly at tx_xx_net, but the TCP server the application is connected to does not receive any data.
For the 10G tests, the switch is connected to the NIC using a 1 m Cisco SFP-H10-CU1M 10G SFP+ twinax cable. I am attaching a code snippet just for reference.
If anybody can suggest where things are going wrong, it would be very helpful.
code_snip.txt
exanic-main.c:17:10: fatal error: linux/pci-aspm.h: No such file or directory
https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/?id=7ce2e76a0420
I'm reading the implementation of the exanic_receive_frame function. It looks like there is a potential bug in the way the lock-free queue is implemented.
The implementation does not handle the case where the consumer thread (the thread that calls exanic_receive_frame) reads a chunk while the producer (the FPGA) stores a new portion of data after wrapping around the ring (the generation is increased).
The application has to be very slow to reproduce this issue.
The following is a sequence of actions that causes the bug:
A lock-free circular buffer is usually implemented in the following way:
Producer:
1. Write PrevChunkSeqNum+1 to the generation of the chunk
2. Make sure PrevChunkSeqNum+1 is written before we write the data
3. Write the data; only after that, PrevChunkSeqNum+2 is set
4. PrevChunkSeqNum += 2
Consumer:
1. Read ChunkSeqNum. If the value is odd, goto 1;
2. Make sure ChunkSeqNum is read before the data is read
3. Read the data, and make sure it is read before the ChunkSeqNum check
4. Read ChunkSeqNum again and compare it to what we read at step 1. If it has changed, the chunk was updated while we read the data; thus goto 1 and start from scratch
The filter-common.h needs a __cplusplus guard like exanic.h
It is unclear from the product website and the documentation whether IPv6 is supported.
At a quick glance at the code, it appears that only IPv4 is supported.
Are there plans to rectify this and support IPv6, IGMP, and the other IPv6-related infrastructure (ND, RD)?
We have discovered a bug in exanic_ip_free(). We think the patch is self-explanatory.
$ diff -u libs/exasock/exanic.c patch/exasock/exanic.c
--- libs/exasock/exanic.c 2018-12-27 11:56:58.000000000 +0800
+++ patch/exasock/exanic.c 2019-03-21 13:46:21.173650844 +0800
@@ -310,6 +310,7 @@
static void
exanic_ip_free(struct exanic_ip *ctx)
{
The recvmmsg implementation is broken when using non-blocking sockets and/or MSG_DONTWAIT. It returns either an error or 'vlen' worth of messages, discarding any messages already received and returning when an error occurs.
https://man7.org/linux/man-pages/man2/recvmmsg.2.html
If an error occurs after at least one message has been received, the
call succeeds, and returns the number of messages received. The
error code is expected to be returned on a subsequent call to
recvmmsg(). In the current implementation, however, the error code
can be overwritten in the meantime by an unrelated network event on a
socket, for example an incoming ICMP packet.
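The semantics quoted from the man page can be checked against the native Linux stack with a small non-blocking drain helper. This is a sketch under stated assumptions: drain_datagrams, MAX_MSGS and MSG_LEN are names invented here, and the code says nothing about how exasock implements the call; it only shows that a plain kernel socket returns the partial count rather than an error when fewer than vlen datagrams are queued.

```c
#define _GNU_SOURCE             /* recvmmsg and struct mmsghdr are GNU extensions */
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define MAX_MSGS 8              /* illustrative vlen */
#define MSG_LEN  64             /* illustrative per-datagram buffer size */

/* Hypothetical helper: receive whatever is queued without blocking.
 * Returns the number of datagrams received (0 if none were queued),
 * or -1 on a real error. lens[i] receives the size of datagram i. */
int drain_datagrams(int fd, char bufs[MAX_MSGS][MSG_LEN], unsigned lens[MAX_MSGS])
{
    struct mmsghdr msgs[MAX_MSGS];
    struct iovec iov[MAX_MSGS];

    assert(bufs != NULL && lens != NULL);
    memset(msgs, 0, sizeof(msgs));
    for (int i = 0; i < MAX_MSGS; i++) {
        iov[i].iov_base = bufs[i];
        iov[i].iov_len = MSG_LEN;
        msgs[i].msg_hdr.msg_iov = &iov[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }

    int n = recvmmsg(fd, msgs, MAX_MSGS, MSG_DONTWAIT, NULL);
    if (n < 0)
        return (errno == EAGAIN || errno == EWOULDBLOCK) ? 0 : -1;

    /* n may be anywhere in 1..MAX_MSGS; messages already received are kept */
    for (int i = 0; i < n; i++)
        lens[i] = msgs[i].msg_len;
    return n;
}
```

Running the same helper with and without the exasock wrapper on a socket that has fewer than MAX_MSGS datagrams queued would be one way to compare the two behaviours.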
We have two applications running, one without exasock and one with exasock. The one without exasock crashes and after it crashes it is impossible to start without bouncing the port. The connection seems to get stuck in SYN-SENT (via exasock-stat). We are using exanic x25, and driver version 2.6.1. The problem is that we can't use exasock for both because we run out of tx buffers on the other one. We're not sure why this happens (the hang after epoll), but it only happens on x25 and not the x10. Please let me know how we can debug further on our end or if you need more information.
We have discovered a possible race condition between exasock_tcp_send_advance and ACKs to remote hosts' packets. Consider this case:
exasock_tcp_send_advance is called.
In this case the remote host receives an "impossible ACK": under normal circumstances the SEQ in packet B can never be less than the SEQ in packet A, yet because the kernel module and the userspace application sending packets run in different threads, this can theoretically happen. We have observed this in a real setting because random delays are possible between sending packets via libexanic and calling exasock_tcp_send_advance.
A different vendor, Solarflare, handles this by deliberately setting the SEQ value in empty ACKs to a value from the future, namely send_seq + min(rwnd_len, cwnd_len, mss) (it is a bit more complicated than that, but you get the picture). This way those ACKs are technically always correct and just appear severely out of order. An immediate downside of this solution is that traffic sent this way appears severely broken to analysis tools like Wireshark, and for good reason.
Is this race condition dangerous in the wild? Do you have any data on how various TCP stacks handle "impossible ACKs"? Do you see any other solutions to this problem besides the one proposed? We have a patch that implements it in case you wish to experiment, but because of the downsides above it is obviously not fit for mainline as is.
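The formula send_seq + min(rwnd_len, cwnd_len, mss) from the report can be written out with 32-bit wraparound made explicit. This is only an arithmetic sketch of the simplified formula as quoted; future_ack_seq and seq_before are hypothetical names, and neither Solarflare's actual logic nor the reporter's patch is reproduced here.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* Hypothetical helper: SEQ to place in an empty ACK, per the simplified
 * formula in the report. Unsigned overflow wraps modulo 2^32, which is
 * exactly how TCP sequence space behaves. */
uint32_t future_ack_seq(uint32_t send_seq, uint32_t rwnd_len,
                        uint32_t cwnd_len, uint32_t mss)
{
    assert(mss > 0);
    return send_seq + min_u32(min_u32(rwnd_len, cwnd_len), mss);
}

/* Modular comparison: nonzero if a comes before b in sequence space. */
int seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}
```

Because the comparison is modular, a SEQ that has wrapped past zero still compares as later than the current send_seq, which is what makes such an ACK look valid to the receiver.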
When working on the Corundum device driver, I noticed a few things in the exanic kernel module that can be improved.
First, misc_dev.parent should be set to the device object before registering it. If you do that, symlinks will be created in sysfs to cross-link the miscdevice and the PCIe device. This change is quite simple:
exanic->misc_dev.name = exanic->name;
exanic->misc_dev.fops = &exanic_fops;
+ exanic->misc_dev.parent = dev;
err = misc_register(&exanic->misc_dev);
(looks like this needs to be done in two places, as there are two calls to misc_register)
Second, you can update exanic_get_sysfs_path to use that symlink to find the PCIe device from the miscdevice instead of from one of the network interfaces, by calling realpath on "/sys/class/misc/%s/device" with exanic->name. This could enable things like hot reset to work during firmware updates even if no network interfaces are registered.
Third, miscdevice already handles the find_by_minor stuff internally, and by default private_data points to the miscdevice struct. All you need to do to get a reference to the exanic struct is something like this in open, release, mmap, and ioctl:
struct miscdevice *miscdev = filp->private_data;
struct exanic *exanic = container_of(miscdev, struct exanic, misc_dev);
(and don't forget to leave filp->private_data alone in exanic_open)
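The container_of step can be tried outside the kernel with mock types. Everything below is a userspace stand-in written for illustration: the struct definitions are simplified placeholders rather than the kernel's or the driver's real layouts, the macro is a simplified version of the kernel's, and exanic_from_file is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace version of the kernel's container_of macro. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Mock types standing in for the kernel structures. */
struct miscdevice { int minor; const char *name; };
struct exanic     { int id; struct miscdevice misc_dev; };
struct file       { void *private_data; };

/* Recover the enclosing exanic struct from the miscdevice pointer that
 * the misc core leaves in filp->private_data. */
struct exanic *exanic_from_file(struct file *filp)
{
    struct miscdevice *miscdev = filp->private_data;

    assert(miscdev != NULL);
    return container_of(miscdev, struct exanic, misc_dev);
}
```

The pointer arithmetic only works because misc_dev is embedded in struct exanic by value; a pointer member would need a different lookup.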
The compiler returns this error:
exanic-devkit-extended-memory-write.c:19:5: error: ‘for’ loop initial declarations are only allowed in C99 mode
for (int i = 0; i < BUFFER_SIZE; i++)
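Older gcc versions default to a pre-C99 dialect, which rejects a declaration inside the for initializer. One way to make such a loop build everywhere is to hoist the declaration to the top of the block, as sketched below; fill_buffer, the BUFFER_SIZE value and the loop body are placeholders, since the body of the original example is not shown in the report.

```c
#include <assert.h>
#include <stddef.h>

#define BUFFER_SIZE 16          /* placeholder value for illustration */

/* Placeholder routine showing the C89-compatible loop form. */
void fill_buffer(unsigned int *buf)
{
    size_t i;                   /* declared at block start, not in the for */

    assert(buf != NULL);
    for (i = 0; i < BUFFER_SIZE; i++)
        buf[i] = (unsigned int)i;
}
```

Alternatively, passing -std=gnu99 (or -std=c99) to gcc lets the original declaration compile unchanged.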
Please add git tags for each release/version to make it easier to debug, diff, and package this software.
Also, can you confirm that the current master is 2.0.1 and are there any differences between what's in this repository and the tarball in the RPM/DEB?
Thank you!
I tried exanic version 2.7.0 on kernel version 5.19.2 and encountered some weird bugs. These bugs disappeared after I added #include <linux/ethtool.h> in exanic-phyops.h.
As already mentioned in multiple places, this is already known by Cisco, and it is increasingly becoming quite an issue for our developers, who need to stick to a certain kernel for which the driver has to be built. Are there efforts underway for code updates?
Build error:
/build/exanic-software/modules/exanic/exanic-netdev.c:1581:31: error: initialization of 'int (*)(struct net_device *, struct ethtool_coalesce *, struct kernel_ethtool_coalesce *, struct netlink_ext_ack *)' from incompatible pointer type 'int (*)(struct net_device *, struct ethtool_coalesce *)' [-Werror=incompatible-pointer-types]
.get_coalesce = exanic_netdev_get_coalesce,
^~~~~~~~~~~~~~~~~~~~~~~~~~
/build/exanic-software/modules/exanic/exanic-netdev.c:1581:31: note: (near initialization for 'exanic_ethtool_ops.<anonymous>.get_coalesce')
/build/exanic-software/modules/exanic/exanic-netdev.c:1582:31: error: initialization of 'int (*)(struct net_device *, struct ethtool_coalesce *, struct kernel_ethtool_coalesce *, struct netlink_ext_ack *)' from incompatible pointer type 'int (*)(struct net_device *, struct ethtool_coalesce *)' [-Werror=incompatible-pointer-types]
.set_coalesce = exanic_netdev_set_coalesce,
^~~~~~~~~~~~~~~~~~~~~~~~~~
/build/exanic-software/modules/exanic/exanic-netdev.c:1582:31: note: (near initialization for 'exanic_ethtool_ops.<anonymous>.set_coalesce')
We are currently developing our application using exasock, and we are having trouble debugging it with gdb: because we have to launch the application with exasock in front (i.e. exasock ./appname), gdb is unable to debug the application.
Communication works well: ping, sockperf and so on. However, when exasock is added to the mix, connect() returns the following error: EADDRNOTAVAIL (Cannot assign requested address).
Server side: nc -l 192.168.80.2 11111
Exablaze side: nc -s 192.168.80.3 192.168.80.2 11111 ✔️
Exablaze side with exasock: EXASOCK_DEBUG=1 exasock --trace --debug nc -s 192.168.80.3 192.168.80.2 11111:
[pid 7524] signal(13, SIG_IGN) = SIG_DFL
[pid 7524] read(5, "0\n", 2) = 2
[pid 7524] close(5) = 0
[pid 7524] read(4, "\372Y\220\216\334\34}\304\tC\23\270U\243\v\304(\276\330\330\330\345:\305\24R\246\202H\177\233e"..., 236) = 236
[pid 7524] close(4) = 0
[pid 7524] socket(AF_INET, SOCK_STREAM, 6) = 4
[pid 7524] fcntl(4, F_GETFL) = 2
[pid 7524] fcntl(4, F_SETFL, O_NONBLOCK|0x2) = 0
[pid 7524] setsockopt(4, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
[pid 7524] bind(4, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("192.168.80.3")}, 16) = 0
[pid 7524] connect(4, {sa_family=AF_INET, sin_port=htons(11111), sin_addr=inet_addr("192.168.80.2")}, 16exasock: enabled bypass on fd 4
) = -1 EADDRNOTAVAIL (Cannot assign requested address)
Ncat: Cannot assign requested address.
Running exanic-2.7.3-1.el9.x86_64; the provided fc32 RPMs yielded the same result.
With UDP (nc -u) it seems to work with exasock; however, no accelerated sockets are shown by exasock-stat. I suspect the socket is not accelerated, and that this is why it works.
/*
    Simple udp server
*/
#include <stdio.h>      //printf
#include <string.h>     //memset
#include <stdlib.h>     //exit(0);
#include <unistd.h>     //close
#include <arpa/inet.h>
#include <sys/socket.h>

#define BUFLEN 512  //Max length of buffer
#define PORT 8888   //The port on which to listen for incoming data

void die(const char *s)
{
    perror(s);
    exit(1);
}

int main(void)
{
    struct sockaddr_in si_me, si_other;
    int s;
    socklen_t slen;
    ssize_t recv_len;
    char buf[BUFLEN];

    //create a UDP socket
    if ((s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)) == -1)
    {
        die("socket");
    }

    // zero out the structure
    memset((char *) &si_me, 0, sizeof(si_me));
    si_me.sin_family = AF_INET;
    si_me.sin_port = htons(PORT);
    si_me.sin_addr.s_addr = htonl(INADDR_ANY);

    //bind socket to port
    if (bind(s, (struct sockaddr *) &si_me, sizeof(si_me)) == -1)
    {
        die("bind");
    }

    //keep listening for data
    while (1)
    {
        printf("Waiting for data...");
        fflush(stdout);

        //recvfrom overwrites slen, so reset it before every call
        slen = sizeof(si_other);

        //try to receive some data, this is a blocking call
        //(one byte is kept free for the terminating NUL)
        if ((recv_len = recvfrom(s, buf, BUFLEN - 1, 0, (struct sockaddr *) &si_other, &slen)) == -1)
        {
            die("recvfrom()");
        }
        buf[recv_len] = '\0';

        //print details of the client/peer and the data received
        printf("Received packet from %s:%d\n", inet_ntoa(si_other.sin_addr), ntohs(si_other.sin_port));
        printf("Data: %s\n", buf);

        //now reply the client with the same data
        if (sendto(s, buf, recv_len, 0, (struct sockaddr *) &si_other, slen) == -1)
        {
            die("sendto()");
        }
    }

    close(s);
    return 0;
}
/*
    Simple udp client
*/
#include <stdio.h>      //printf
#include <string.h>     //memset
#include <stdlib.h>     //exit(0);
#include <unistd.h>     //close
#include <arpa/inet.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <errno.h>

#define SERVER "127.0.0.1"
#define BUFLEN 512  //Max length of buffer
#define PORT 8888   //The port on which to send data

void die(const char *s)
{
    perror(s);
    exit(1);
}

int main(void)
{
    struct sockaddr_in si_other;
    int s;
    socklen_t slen = sizeof(si_other);
    char buf[BUFLEN];
    char message[BUFLEN];
    int epfd;
    struct epoll_event ee;
    struct epoll_event eelist[10];
    int ret;

    epfd = epoll_create(1);
    if ((s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)) == -1)
    {
        die("socket");
    }

    memset((char *) &si_other, 0, sizeof(si_other));
    si_other.sin_family = AF_INET;
    si_other.sin_port = htons(PORT);
    if (inet_aton(SERVER, &si_other.sin_addr) == 0)
    {
        fprintf(stderr, "inet_aton() failed\n");
        exit(1);
    }

    memset(&ee, 0, sizeof(ee));
    ee.events = EPOLLIN;
    ee.data.fd = s;
    epoll_ctl(epfd, EPOLL_CTL_ADD, s, &ee);

    while (1)
    {
        printf("Enter message : ");
        //fgets replaces the unsafe gets(); stop on end of input
        if (fgets(message, sizeof(message), stdin) == NULL)
        {
            break;
        }
        message[strcspn(message, "\n")] = '\0';

        //send the message
        if (sendto(s, message, strlen(message), 0, (struct sockaddr *) &si_other, slen) == -1)
        {
            die("sendto()");
        }

        //receive a reply and print it
        //clear the buffer by filling null, it might have previously received data
        memset(buf, '\0', BUFLEN);

        //wait up to 100 ms for the reply to become readable
        ret = epoll_wait(epfd, eelist, 10, 100);
        if (ret == -1) {
            printf("epoll_wait failed,errno=%d,errstr=%s\n", errno, strerror(errno));
            return -1;
        }
        if (ret == 0) {
            printf("epoll_wait timeout\n");
            return -1;
        }
        printf("epoll_wait ret=%d\n", ret);

        //leave the last byte untouched so buf stays NUL-terminated
        slen = sizeof(si_other);
        if (recvfrom(s, buf, BUFLEN - 1, 0, (struct sockaddr *) &si_other, &slen) == -1)
        {
            die("recvfrom()");
        }
        puts(buf);
    }

    close(s);
    return 0;
}
Native Linux is OK.
Onload from Solarflare is OK.
Exasock fails.
This issue comes from the Accelecom company in China.
Hello, if somebody is monitoring the issues here.
The problem:
When the application runs with exasock and an accelerated TCP socket calls connect(), it passes through exa_socket_update_interfaces and then exanic_ip_acquire, correctly determines the interface from the destination IP, but fails to map the network interface to an ExaNIC port. More specifically, the problem is in libexanic: exanic_find_port_by_interface_name fails to provide device:port for the interface, even though this interface certainly belongs to the ExaNIC card and exanic-config lists it correctly. In other words, the mapping of ports to network interfaces in exanic_t if_index is 100% correct, but the reverse mapping of the interface to the port fails because the ioctl(fd, EXAIOCGIFINFO, &ifr) call returns -1 (the ioctl there actually seems to be the OS one with some wrappers). This problem arises on two different machines, each with the same ExaNIC X100 card, under both RHEL8 and RHEL9. The version of exanic-software under consideration is the latest: release 2.7.3, 13 Oct. 2022.
Calling bind() to INADDR_ANY or to the interface IP doesn't help. Is there any subtlety with exasock? Or must some specific function be called after socket creation and before connect()? The application uses ordinary AF_INET sockets, and exasock is invoked as a wrapper, as described in the documentation.
Hi,
I was wondering what's the best way to parse the payload out of a UDP Ethernet frame when using one of the ExaNIC cards? I am only familiar with libpcap, but was wondering if there's a more performant way when using an Exablaze card/libexanic?
Thank you!
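Once a raw frame is in a buffer (for instance one filled by exanic_receive_frame), the UDP payload can be located by walking the Ethernet, IPv4 and UDP headers by hand. The parser below is a sketch under stated assumptions: struct udp_view and parse_udp_frame are hypothetical names, not libexanic API, and only untagged or single-VLAN-tagged, unfragmented IPv4 frames are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: a view into the caller's frame buffer. */
struct udp_view {
    const uint8_t *payload;
    size_t len;
    uint16_t src_port, dst_port;
};

/* Read a big-endian (network order) 16-bit value. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns 0 and fills `out` for an unfragmented IPv4/UDP frame, else -1. */
int parse_udp_frame(const uint8_t *frame, size_t frame_len, struct udp_view *out)
{
    size_t off = 14;                        /* dst MAC + src MAC + ethertype */

    assert(frame != NULL && out != NULL);
    if (frame_len < off)
        return -1;
    uint16_t ethertype = rd16(frame + 12);
    if (ethertype == 0x8100) {              /* one 802.1Q VLAN tag */
        if (frame_len < off + 4)
            return -1;
        ethertype = rd16(frame + 16);
        off += 4;
    }
    if (ethertype != 0x0800 || frame_len < off + 20)
        return -1;                          /* not IPv4, or truncated */

    const uint8_t *ip = frame + off;
    size_t ihl = (size_t)(ip[0] & 0x0f) * 4;
    if ((ip[0] >> 4) != 4 || ihl < 20 || ip[9] != 17)
        return -1;                          /* bad header or not UDP */
    if (rd16(ip + 6) & 0x3fff)
        return -1;                          /* fragmented datagram */
    if (frame_len < off + ihl + 8)
        return -1;

    const uint8_t *udp = ip + ihl;
    size_t udp_len = rd16(udp + 4);         /* UDP header plus payload */
    if (udp_len < 8 || frame_len < off + ihl + udp_len)
        return -1;

    out->src_port = rd16(udp);
    out->dst_port = rd16(udp + 2);
    out->payload = udp + 8;
    out->len = udp_len - 8;
    return 0;
}
```

Taking the payload length from the UDP header rather than from the frame length means trailing bytes such as Ethernet padding are ignored, and no copy is made because the view points into the receive buffer.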
Prior to Sept 15, 2021, there have been regular commits and releases via this repository. But there have been none subsequently. Have you shifted to any internal repo? How do we get the updates? I noticed that the FDK 2.9.0 has been released here, but no download options for the exanic-software package.
I'm having problems with DROP_MEMBERSHIP + exasock.
When I comment out the unsubscribe line of my application, the program runs normally. If it is present, the unsubscribe results in a segmentation fault that occurs on the libexasock side.
I discovered the "--debug" option, and with it I get the following error:
a.out: structs.h:487: exa_hashtable_mcast_remove: Assertion `memb_to_insert != ((void *)0)' failed.
Aborted (core dumped)
To reproduce the problem, you can use this minimal program. It does ADD_MEMBERSHIP and, after an interval, DROP_MEMBERSHIP:
#include <sys/types.h> /* See NOTES */
#include <sys/socket.h>
#include <netinet/in.h>
#include <cerrno>       // errno
#include <cstring>
#include <string>       // std::string
#include <arpa/inet.h>
#include <stdexcept>
#include <sys/epoll.h>
#include <unistd.h>
#include <iostream>

int main()
{
    const char *interface = "192.168.2.2";
    constexpr int SIZE_OF_BUFFER = 2048;
    char buffer[SIZE_OF_BUFFER];

    int fd0 = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in servaddr;
    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(10001);
    servaddr.sin_addr.s_addr = inet_addr("233.252.14.1");
    if (::bind(fd0, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0)
    {
        throw std::runtime_error("socket couldn't bind.");
    }

    int epoll_fd = epoll_create(1);
    struct epoll_event event;
    event.events = EPOLLIN;
    event.data.u32 = 10000;
    epoll_ctl(epoll_fd, EPOLL_CTL_ADD, fd0, &event);

    // subscribe to the multicast group
    ip_mreq mreq;
    mreq.imr_multiaddr.s_addr = inet_addr("233.252.14.1");
    mreq.imr_interface.s_addr = inet_addr(interface);
    if (setsockopt(fd0, IPPROTO_IP, IP_ADD_MEMBERSHIP,
                   &mreq, sizeof(mreq)) < 0)
    {
        throw std::runtime_error(
            std::string("subscribe: ") + strerror(errno) );
    }

    int j = 1;
    for (;;)
    {
        constexpr int max_events = 64;
        struct epoll_event events[max_events];
        int n = epoll_wait(epoll_fd, events, max_events, 0);
        for (int i = 0; i < n; i++)
        {
            std::cout << "events[i].data.u32: " << events[i].data.u32 << std::endl;
            if (events[i].data.u32 == 10000)
            {
                int n_bytes = recvfrom(fd0, buffer, SIZE_OF_BUFFER, 0, 0, 0);
                j++;
                std::cout << "j= " << j << std::endl;
                std::cout << "n_bytes: " << n_bytes << std::endl;
                if (j == 100)
                {
                    // unsubscribe; this is the call that triggers the assertion
                    mreq.imr_multiaddr.s_addr = inet_addr("233.252.14.1");
                    mreq.imr_interface.s_addr = inet_addr(interface);
                    if (setsockopt(fd0, IPPROTO_IP, IP_DROP_MEMBERSHIP,
                                   &mreq, sizeof(mreq)) < 0)
                    {
                        throw std::runtime_error(
                            std::string("unsubscribe: ") + strerror(errno) );
                    }
                }
            }
        }
    }
}
To run:
$ exasock --debug ./a.out
I would like to know whether my software is using the library correctly.
I will try to debug this assert to find a fix in the meantime.
Hi Engineer Team,
X100 supports Port Mirroring from 20211008, can you update https://github.com/cisco/exanic-software/blob/master/RELEASE-NOTES.txt as well?
Thanks.
Hi
I just installed a fresh copy of this repo from source, and I keep getting this error when running "exanic-config".
Any ideas what I might be doing wrong?
I have tried the card on a different computer and it works just fine.
$ sudo exanic-config
Device exanic0:
Hardware type: ExaNIC X4
exanic0 sysfs path: invalid port number
Temperature: 53.6 C VCCint: 1.00 V VCCaux: 1.81 V
Fan speed: 0 RPM
Function: network interface
Firmware date: 20170323 (Thu Mar 23 07:30:24 2017)
Bridging: off
Thank you!
Steps to repeat the problem:
1- Generate the .fw using exanic devkit + trigger_example
2- Update the firmware
3- Download this software source
4- make && make install
5- cd exanic-software/examples/devkit
6- gcc exasock-tcp-responder-example.c -lexanic -lexasock_ext
7- $ ./a.out 541 192.168.1.50 8000
Connected to 192.168.1.50:8000
exasock_tcp_get_device: Operation not supported
I've noticed the presence of "/dev/exasock" as well.
I've noticed a similar thread where, on Ubuntu/Debian,
"sudo update-initramfs -k all -u"
worked as a solution. Unfortunately, it didn't help much for me.
I have been trying for hours to solve this problem; it would be of much help if you could help me resolve it.
Subject: [PATCH] getpeername should check if tcp connection has been fully
established
libs/exasock/exanic.c | 8 ++++++++
libs/exasock/exanic.h | 1 +
libs/exasock/socket/socket.c | 6 +++++-
libs/exasock/tcp.h | 8 ++++++++
4 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/libs/exasock/exanic.c b/libs/exasock/exanic.c
index 234cf04..8bf8567 100644
--- a/libs/exasock/exanic.c
+++ b/libs/exasock/exanic.c
@@ -1529,6 +1529,14 @@ exanic_tcp_connecting(struct exa_socket * restrict sock)
return exa_tcp_connecting(&ctx->tcp);
}
+bool
+exanic_tcp_established(struct exa_socket * restrict sock)
+{
bool
exanic_tcp_listening(struct exa_socket * restrict sock)
{
diff --git a/libs/exasock/exanic.h b/libs/exasock/exanic.h
index 117268a..0ee78bf 100644
--- a/libs/exasock/exanic.h
+++ b/libs/exasock/exanic.h
@@ -50,6 +50,7 @@ void exanic_tcp_connect(struct exa_socket * restrict sock,
void exanic_tcp_shutdown_write(struct exa_socket * restrict sock);
void exanic_tcp_reset(struct exa_socket * restrict sock);
bool exanic_tcp_connecting(struct exa_socket * restrict sock);
+bool exanic_tcp_established(struct exa_socket * restrict sock);
bool exanic_tcp_listening(struct exa_socket * restrict sock);
bool exanic_tcp_writeable(struct exa_socket * restrict sock);
bool exanic_tcp_write_closed(struct exa_socket *sock);
diff --git a/libs/exasock/socket/socket.c b/libs/exasock/socket/socket.c
index 0f1ffb6..801bd4b 100644
--- a/libs/exasock/socket/socket.c
+++ b/libs/exasock/socket/socket.c
@@ -1199,6 +1199,7 @@ int
getpeername(int sockfd, struct sockaddr *addr, socklen_t *addrlen)
{
struct exa_socket * restrict sock = exa_socket_get(sockfd);
+bool connected = false;
int ret;
TRACE_CALL("getpeername");
@@ -1209,7 +1210,10 @@ getpeername(int sockfd, struct sockaddr *addr, socklen_t *addrlen)
{
exa_read_lock(&sock->lock);
-if (!sock->connected)
+connected = sock->connected;
+if (SOCK_STREAM == sock->type)
+connected = exanic_tcp_established(sock);
+if (!connected)
{
errno = ENOTCONN;
ret = -1;
diff --git a/libs/exasock/tcp.h b/libs/exasock/tcp.h
index b07f217..97be50d 100644
--- a/libs/exasock/tcp.h
+++ b/libs/exasock/tcp.h
@@ -261,6 +261,14 @@ exa_tcp_connecting(struct exa_tcp_conn * restrict ctx)
state->state == EXA_TCP_SYN_RCVD;
}
+static inline bool
+exa_tcp_established(struct exa_tcp_conn * restrict ctx)
+{
2.14.0-rc0
pci_set_dma_mask and pci_set_consistent_dma_mask were deprecated and later removed in Linux 5.18.
DKMS make.log for exanic-2.7.3.2-git for kernel 6.1.12-060112-generic (x86_64)
Sat Feb 18 15:52:39 MSK 2023
make: Entering directory '/var/lib/dkms/exanic/2.7.3.2-git/build/modules'
make -C /lib/modules/6.1.12-060112-generic/build M=$PWD modules
make[1]: Entering directory '/usr/src/linux-headers-6.1.12-060112-generic'
warning: the compiler differs from the one used to build the kernel
The kernel was built by: x86_64-linux-gnu-gcc-9 (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
You are using: gcc-9 (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
CC [M] /var/lib/dkms/exanic/2.7.3.2-git/build/modules/exanic/exanic-main.o
/var/lib/dkms/exanic/2.7.3.2-git/build/modules/exanic/exanic-main.c: In function ‘exanic_probe’:
/var/lib/dkms/exanic/2.7.3.2-git/build/modules/exanic/exanic-main.c:1159:11: error: implicit declaration of function ‘pci_set_dma_mask’ [-Werror=implicit-function-declaration]
1159 | err = pci_set_dma_mask(pdev, DMA_BIT_MASK(exanic->dma_addr_bits));
| ^~~~~~~~~~~~~~~~
/var/lib/dkms/exanic/2.7.3.2-git/build/modules/exanic/exanic-main.c:1166:11: error: implicit declaration of function ‘pci_set_consistent_dma_mask’ [-Werror=implicit-function-declaration]
1166 | err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(exanic->dma_addr_bits));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
make[3]: *** [scripts/Makefile.build:250: /var/lib/dkms/exanic/2.7.3.2-git/build/modules/exanic/exanic-main.o] Error 1
make[2]: *** [scripts/Makefile.build:500: /var/lib/dkms/exanic/2.7.3.2-git/build/modules/exanic] Error 2
make[1]: *** [Makefile:2011: /var/lib/dkms/exanic/2.7.3.2-git/build/modules] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-6.1.12-060112-generic'
make: *** [Makefile:9: default] Error 2
make: Leaving directory '/var/lib/dkms/exanic/2.7.3.2-git/build/modules'
Hi,
I am sure this is slightly off topic in terms of software, but I thought it could be the fastest way to get an accurate answer.
I just purchased a second-hand X4 rev 2. The card is detected and I was able to update the firmware to the latest version, but the fan does not work (exanic-config shows Fan speed: 0 rpm, and I can confirm that the fan doesn't rotate).
Could this be a software problem, or is it definitely hardware?
Currently the card is running at 78-79 °C.
I am trying to make sure that the error is not on my side, prior to returning the card.
Any help is appreciated.
Thank you.
Note: At the moment I don't have any cables attached as I am still waiting for delivery on those. Could this be the problem?
Hi Engineering Team,
Please check the two links below for 2.7.3:
https://github.com/cisco/exanic-software/archive/refs/tags/v2.7.3.zip
https://github.com/cisco/exanic-software/archive/refs/tags/v2.7.3.tar.gz
The driver version in the "exanic_version.h" file shows "2.7.2-git", so "modinfo exanic" also reports "2.7.2-git".
We tried the master archive "https://github.com/cisco/exanic-software/archive/refs/heads/master.zip" and did not see the issue there.
Can anybody help?
Maybe it is because of this delay:
exasock_tcp_intercept()
...
queue_delayed_work(tcp_workqueue, &tcp_tx_work, 1)
....
Am I right?
It appears that exasock neither terminates nor times out sockets in the FIN-WAIT-2 state when the exasock process is terminated in the middle of receiving data from a peer, leaving those sockets lingering until the system is rebooted.
Here is a simple reproducible scenario using netcat:
1. Generate a test file:
base64 /dev/urandom | head -c 10000000 > file.txt
2. Start the server (exasock is not necessary):
cat file.txt - | nc -l 10.248.194.179 12345
3. Start the client under exasock:
exasock nc 10.248.194.179 12345
4. Press Ctrl+C in the client.
5. Observe the socket in FIN-WAIT-2 state in exasock-stat.
Example:
exasock-stat
Active ExaNIC Sockets accelerated connections (servers and established):
Proto | Recv-Q | Send-Q | Local Address | Foreign Address | State
TCP | 1048396 | 0 | 10.248.194.178:49765 | 10.248.194.179:12345 | FIN-WAIT-2
TCP | 1048556 | 0 | 10.248.194.178:35827 | 10.248.194.179:12345 | FIN-WAIT-2
Besides the leaked ports and resources, after reviewing the code I also found that the kernel module keeps re-evaluating those sockets indefinitely.
Hi,
I am using exanic_receive_frame with flow steering to read a UDP frame from a multicast group, and I then parse the received frame according to the network frame format. After parsing, I found that the received frame is 60 bytes longer than the size of the frame headers plus the size of the UDP payload. These extra 60 bytes are at the end of the frame. What could they be? We are certain that we did not send those 60 bytes to the multicast group.
The following code snippet is what I used to parse the frame.
// "packet" is the userspace receive buffer, "length" is the length of the frame
const struct ether_header* ethernetHeader = (struct ether_header*)packet;
if (ntohs(ethernetHeader->ether_type) == ETHERTYPE_IP) {
const struct ip* ipHeader = (struct ip*)(packet + sizeof(struct ether_header));
int ipHeaderLen = ipHeader->ip_hl * 4;
if (ipHeader->ip_p == IPPROTO_UDP) {
const struct udphdr* udpHeader = (struct udphdr*)(packet + sizeof(struct ether_header) + ipHeaderLen);
int udpPayloadLen = ntohs(udpHeader->len) - sizeof(struct udphdr);
const uint8_t* payload= packet + sizeof(struct ether_header) + ipHeaderLen + sizeof(struct udphdr);
/*
we found out that the frame length == sizeof(struct ether_header) + ipHeaderLen + sizeof(struct udphdr) + udpPayloadLen + **60**
What are the mysterious 60 bytes at the end?
*/
}
}
Thank you very much.
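One way to measure and trim whatever follows the IP datagram is to trust the IPv4 total-length field rather than the received frame length. The sketch below is a minimal illustration under stated assumptions: the frame is untagged Ethernet II carrying IPv4, and the helper name bytes_after_ip_datagram is hypothetical (it is not part of libexanic). It does not say what the extra bytes are; padding, a frame check sequence, or a trailer added by equipment on the path are all possibilities that the report does not settle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN     14      /* untagged Ethernet II header (assumption) */
#define ETHERTYPE_IPV4  0x0800
#define IPV4_MIN_HDR    20

/* Read a 16-bit big-endian (network order) value from a byte buffer. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Hypothetical helper: return how many bytes of the frame lie beyond the end
 * of the IPv4 datagram, or -1 if the frame is not untagged IPv4 or is too
 * short to contain the datagram it announces.
 */
static long bytes_after_ip_datagram(const uint8_t *frame, size_t frame_len)
{
    assert(frame != NULL);
    if (frame_len < ETH_HDR_LEN + IPV4_MIN_HDR)
        return -1;
    if (read_be16(frame + 12) != ETHERTYPE_IPV4)
        return -1;

    const uint8_t *ip = frame + ETH_HDR_LEN;
    size_t ihl = (size_t)(ip[0] & 0x0f) * 4;   /* IP header length in bytes */
    size_t total_len = read_be16(ip + 2);      /* IP header + payload */
    if ((ip[0] >> 4) != 4 || ihl < IPV4_MIN_HDR || total_len < ihl)
        return -1;
    if (ETH_HDR_LEN + total_len > frame_len)
        return -1;                             /* truncated frame */

    return (long)(frame_len - ETH_HDR_LEN - total_len);
}
```

Calling this with the packet and length from the snippet above would report the size of the trailing region, and the UDP payload can then be bounded by the IP total length instead of the frame length.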
Hi, trying to automate Exanic software installation and hit a snag. If dkms fails in %post it breaks the anaconda installation. This spec change avoids the failure and allows the installation to complete and for the exanic modules to be built at a later stage:
%post dkms
dkms add -m %{name} -v %{version}-%{release} --rpm_safe_upgrade || true
dkms build -m %{name} -v %{version}-%{release} --rpm_safe_upgrade || true
dkms install -m %{name} -v %{version}-%{release} --rpm_safe_upgrade || true
Looking forward to what your thoughts are.
As the title says: in socket.c, the connect_tcp function always returns -1 with errno = EINPROGRESS when setting up a TCP connection on a socket with the O_NONBLOCK flag.
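For reference, POSIX permits connect() on a non-blocking socket to return -1 with EINPROGRESS while the handshake is still in flight; the caller then waits for the socket to become writable and reads SO_ERROR to learn the outcome. The sketch below shows that conventional caller-side handling using only standard socket calls. The function name connect_nonblocking is hypothetical, and whether exasock's connect_tcp is meant to follow this pattern in every case is not stated in the report.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: put fd into non-blocking mode, start a connect, and
 * wait up to timeout_ms for it to finish.
 * Returns 0 once connected, or -1 with errno set.
 */
static int connect_nonblocking(int fd, const struct sockaddr *addr,
                               socklen_t len, int timeout_ms)
{
    assert(addr != NULL);

    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    if (connect(fd, addr, len) == 0)
        return 0;                   /* completed immediately */
    if (errno != EINPROGRESS)
        return -1;                  /* genuine failure */

    /* Handshake in progress: wait until the socket becomes writable. */
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    int n;
    do {
        n = poll(&pfd, 1, timeout_ms);
    } while (n == -1 && errno == EINTR);
    if (n == 0) {
        errno = ETIMEDOUT;
        return -1;
    }
    if (n < 0)
        return -1;

    /* Writable does not mean success: SO_ERROR holds the real result. */
    int err = 0;
    socklen_t errlen = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen) == -1)
        return -1;
    if (err != 0) {
        errno = err;
        return -1;
    }
    return 0;
}
```

With this pattern, an EINPROGRESS return is only a problem if the socket never becomes writable afterwards or SO_ERROR never reports the final status.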
Hi,
Imagine a situation where you close a connection but the remote end does not respond at all. exasock_tcp_close_worker will set the socket state to FIN-WAIT-1, after which exasock will send a FIN. Since the remote end never responds, exasock_tcp_update_state will not be called and the socket will resend the FIN every ~1 second forever. What's worse, the socket is then lost in the kernel: it takes up resources and uses the network, but is not accessible from userspace in any way (I guess?).
Is it possible to introduce some kind of retransmission timeout, at least for this exact situation?
There is no shared library for libexanic.
Hey
I have developed an FDK application that places an order based on a simple trigger, using an ExaNIC X25 card. I tested it in a local environment with a packet sender running on another system: the application uses the trigger firmware to establish the connection and send a chunk of data from memory to the connected server. This works as expected.
When I try to use the same application to establish a connection and send data to NSE (National Stock Exchange), the complete Ethernet packet with the correct payload is observed on the wire (tx_data_net), but no response is received from NSE.
I cannot tell whether my data is actually reaching NSE. If anybody has any input on this issue, it would be greatly helpful.
I am implementing TCP receive functionality using the exanic library's Rx buffer reading (exanic_receive_frame API) and applying filters for our TCP connections. While I can capture our acknowledgement frames successfully, I'm also seeing additional acknowledgements, possibly from the exanic kernel module. These acknowledgements have lower sequence numbers than expected, causing issues with TCP session handling. I need to understand how to update the sequence number in the acknowledgement packet sent by the exanic kernel module or if it's possible to disable this feature.
Steps to repeat the problem:
1- Generate the .fw using exanic devkit + trigger_example
2- Update the firmware
3- Download this software source
4- make && make install
5- cd exanic-software/examples/devkit
6- gcc exasock-tcp-responder-example.c -lexanic -lexasock_ext
7- $ ./a.out 541 192.168.1.50 8000
Connected to 192.168.1.50:8000
exasock_tcp_get_device: Operation not supported
I have noticed exasock is using stub.c which always returns -1 with errno=EOPNOTSUPP
(I know this because when I add printfs in the stub.c I can see them and adding printfs to tcp.c doesn't have any effect)
I've noticed the absence of the "/dev/exasock" as well.
Expectation: setting SO_KEEPALIVE to 0 on socket turns off keepalive logic.
Reality: setting SO_KEEPALIVE to 0 on a socket zeroes out the userspace keepalive variables but does not zero out the kernel keepalive timer. When the time for the next keepalive packet comes, no "is keepalive enabled" check is performed. state->p.tcp.keepalive.probes is zero, so the tcp->keepalive.probe_cnt < state->p.tcp.keepalive.probes check never passes. As a result, keepalive is not turned off; on the contrary, the socket is invariably closed with ETIMEDOUT when the timer reaches zero.
I fixed it for myself by changing line
https://github.com/exablaze-oss/exanic-software/blob/404da33717db8029557eb884fb1285f7c43bc75e/modules/exasock/exasock-tcp.c#L2970
to
if (tcp->keepalive.timer == 0 && state->p.tcp.keepalive.probes)
but I'm not sure if that's the best way to do so.