Comments (6)

tbarbette commented on July 22, 2024

I think the problem is with the caching.

The packet is cloned by Tee element.

  • The first output gets the clone, so the packet is not recognized as a DPDK packet: an mbuf is taken from the mbuf pool, the content is copied into it, and it is then freed by DPDK.
  • The second packet goes to DiscardNoFree, and its mbuf is lost in the wild (it never goes back to the pool).

After a while, there are no more mbufs available, so DPDKDevice::get_pkt() returns null, and (I should catch it but...) memcpy copies to null...
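To make the failure mode concrete, here is a minimal sketch of a fixed-size buffer pool that leaks one buffer per packet until allocation fails, plus a copy helper that checks for null before calling memcpy. Everything in it (ToyMbuf, ToyPool, copy_into_new_mbuf, leak_until_failure) is a hypothetical stand-in for illustration; it is not the DPDK or FastClick API, and the real fix may look different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an mbuf; not a DPDK type.
constexpr std::size_t kDataSize = 64;
struct ToyMbuf {
    unsigned char data[kDataSize];
};

// Hypothetical fixed-size pool: get() returns nullptr once it is empty.
class ToyPool {
public:
    explicit ToyPool(std::size_t n) : storage_(n) {
        for (auto &m : storage_) free_.push_back(&m);
    }
    ToyPool(const ToyPool &) = delete;            // pointers into storage_ must stay valid
    ToyPool &operator=(const ToyPool &) = delete;

    ToyMbuf *get() {
        if (free_.empty()) return nullptr;        // pool exhausted
        ToyMbuf *m = free_.back();
        free_.pop_back();
        return m;
    }
    void put(ToyMbuf *m) {
        assert(m != nullptr);
        free_.push_back(m);
    }
    std::size_t available() const { return free_.size(); }

private:
    std::vector<ToyMbuf> storage_;
    std::vector<ToyMbuf *> free_;
};

// Copies the payload into a fresh buffer. Reports failure instead of
// writing through a null pointer when the pool has run dry.
bool copy_into_new_mbuf(ToyPool &pool, const unsigned char *payload,
                        std::size_t len, ToyMbuf **out) {
    if (len > kDataSize) return false;
    ToyMbuf *m = pool.get();
    if (m == nullptr) return false;               // the missing check
    std::memcpy(m->data, payload, len);
    *out = m;
    return true;
}

// Simulates packets whose buffers are never returned to the pool.
// Returns how many packets were handled before allocation failed.
std::size_t leak_until_failure(ToyPool &pool, std::size_t max_packets) {
    const unsigned char payload[4] = {1, 2, 3, 4};
    for (std::size_t i = 0; i < max_packets; ++i) {
        ToyMbuf *m = nullptr;
        if (!copy_into_new_mbuf(pool, payload, sizeof(payload), &m)) return i;
        // m is deliberately not given back: this models the leaked mbuf.
    }
    return max_packets;
}
```

With a pool of three buffers, leak_until_failure stops after three packets and every later copy attempt returns false rather than crashing, which is roughly the behaviour an "out of DPDK buffer" message would report.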

If it's really a cache, and you're calling rte_pktmbuf_free() yourself on the buffer (or just call kill() after a while), you may just need to set a bigger number of MBUFs with DPDKInfo().

I'm going to push a commit to display a nice message "out of DPDK buffer" instead of a crash anyway...

from fastclick.

tbarbette commented on July 22, 2024

For the NUMA problem, you can enable NUMA and then use the THREADOFFSET and MAXTHREADS parameters in the FromDevice.

davidek commented on July 22, 2024

Hi,

Thanks for the tip on the NUMA issue. While it turned out not to be enough to get the behaviour I want, I went back to the documentation and found the NUMA runtime argument to FromDPDKDevice: with that disabled (but NUMA enabled at compile time), the issue seems to be gone.

For the sake of completeness, I've also re-run the test cases that caused the segfaults:

  • The new error message does indeed get triggered when "crash B" described above used to happen. In my specific case it was not a matter of running out of mbufs (it crashes even at the very first packet); rather, it was trying to get an mbuf from the NUMA node where the NIC is, which had no pool allocated (because no lcore was running there).
  • When compiling with NUMA disabled (--with-numa=no), I still get the "crash A" segfault. However, given that I don't need to have NUMA disabled any more, this no longer blocks our work. This could be just the result of a weird configuration (compiling without NUMA on a NUMA-enabled system), but if you would like to dig further I'll be happy to provide any other details needed.

tbarbette commented on July 22, 2024

The problem is that HAVE_NUMA is in fact HAVE_LIBNUMA, meaning that you have the NUMA library, not that you actually use NUMA.

DPDK provides everything I need from libnuma, but I still need libnuma for Netmap. So if you disable --with-numa, you'll end up with half NUMA support, since DPDK has its own support for it.

I changed the get_mbuf function so it takes a packet from the pool attached to the device in ToDPDKDevice (and not from the currently running CPU's NUMA node), which should prevent the problem you describe from happening. I also now get the max socket_id of all used CPUs and used devices, so I'm sure that pool[ID] won't segfault.
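The idea can be sketched roughly as follows: size the pool table from the largest socket id seen on any used CPU or device, then look a pool up by the device's socket rather than the current CPU's. All names here (ToyPoolSlot, pool_table_size, build_pool_table, pool_for_device) are hypothetical, and allocating a pool on every socket that hosts a CPU or a device is an assumption of this sketch, not something the actual commit is confirmed to do.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical placeholder for a per-socket mbuf pool.
struct ToyPoolSlot {
    bool allocated = false;
};

// Number of table entries needed so that every socket id in use,
// whether it hosts a CPU or a device, is a valid index.
std::size_t pool_table_size(const std::vector<int> &cpu_sockets,
                            const std::vector<int> &device_sockets) {
    int max_id = 0;
    for (int s : cpu_sockets) max_id = std::max(max_id, s);
    for (int s : device_sockets) max_id = std::max(max_id, s);
    return static_cast<std::size_t>(max_id) + 1;
}

// Builds the table and marks a pool as allocated on each socket in use.
// Negative ids (an "any socket" style value) are skipped in this sketch.
std::vector<ToyPoolSlot> build_pool_table(const std::vector<int> &cpu_sockets,
                                          const std::vector<int> &device_sockets) {
    std::vector<ToyPoolSlot> table(pool_table_size(cpu_sockets, device_sockets));
    for (int s : cpu_sockets)
        if (s >= 0) table[static_cast<std::size_t>(s)].allocated = true;
    for (int s : device_sockets)
        if (s >= 0) table[static_cast<std::size_t>(s)].allocated = true;
    return table;
}

// Looks the pool up by the device's socket, not the running CPU's.
// Returns nullptr when the id is out of range or no pool exists there.
const ToyPoolSlot *pool_for_device(const std::vector<ToyPoolSlot> &table,
                                   int device_socket) {
    if (device_socket < 0) return nullptr;
    std::size_t idx = static_cast<std::size_t>(device_socket);
    if (idx >= table.size() || !table[idx].allocated) return nullptr;
    return &table[idx];
}
```

For the case reported earlier in the thread, with all lcores on socket 0 and the NIC on socket 1, the table gets two entries and the lookup for the device's socket finds a pool instead of indexing past the end.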

I'll commit tomorrow, I need to test all that a little ;)

tbarbette commented on July 22, 2024

Is it solved by my last commits? (Or the combination of ours I should say)

davidek commented on July 22, 2024

It appears so: under the same conditions (I had to add a StoreEtherAddress to the config to force the creation of the new DPDK mbuf) it no longer crashes. The case that used to cause "crash B" also works smoothly now: it neither segfaults nor triggers the error.
