Comments (7)

tbarbette avatar tbarbette commented on August 24, 2024

It's currently not supported. It would be easy to enable, but then:

  • either scattered buffers would end up in different Packet objects (one per buffer). The second and subsequent packets would therefore look like garbage. Unless your pipeline is very simple, that's not viable.
  • or we skip all "subsequent buffers" and only integrate the first buffer, the one with the header, as a Packet object. Internally, the subsequent buffers will still be linked by DPDK, but that will not be exposed to FastClick elements, which will see a 64B packet. Elements therefore cannot access the payload in the "silently linked" buffers, but the packet will be transmitted as one if forwarded to a DPDK device.
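
To make the second option concrete, here is a minimal sketch of what an element would see versus what would actually be transmitted. It does not use the DPDK or FastClick APIs: `Segment`, `visible_length` and `chain_length` are hypothetical names, and `Segment` only imitates the idea of a buffer that links to a next buffer.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for one receive buffer in a scattered packet.
// It only carries the fields needed for this illustration.
struct Segment {
    const std::uint8_t* data;   // start of this buffer's bytes
    std::uint16_t data_len;     // bytes stored in this buffer only
    Segment* next;              // following buffer, or nullptr for the last one
};

// Length an element would see if only the first buffer is wrapped
// as a Packet object (the "skip subsequent buffers" option).
inline std::size_t visible_length(const Segment* head) {
    return head != nullptr ? head->data_len : 0;
}

// Length of the whole packet, i.e. what would go out on the wire
// if the linked buffers are transmitted together.
inline std::size_t chain_length(const Segment* head) {
    std::size_t total = 0;
    for (const Segment* s = head; s != nullptr; s = s->next) {
        total += s->data_len;
    }
    return total;
}
```

With a 64B first buffer linked to a 1436B second buffer, `visible_length` reports 64 while `chain_length` reports 1500, which is the mismatch described above.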

Even if we imagine providing real support, it would lead to a general performance drop. We would have to handle batches of linked lists of packets instead of batches of packets everywhere, with many "unlikely" conditions all over the code just for the scattered case...

A simple way to handle it would be to copy scattered packets into a single buffer, but that will cost performance, of course.
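
As a rough sketch of that copy-based handling, assuming a hypothetical `Segment` type rather than the real DPDK structures, the linked buffers can be walked once to size the destination and once to copy. The extra pass over every payload byte is where the performance cost comes from.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one buffer of a scattered packet.
struct Segment {
    const std::uint8_t* data;   // start of this buffer's bytes
    std::uint16_t data_len;     // bytes stored in this buffer only
    Segment* next;              // following buffer, or nullptr for the last one
};

// Copy every buffer of the chain, in order, into one contiguous buffer.
inline std::vector<std::uint8_t> linearize(const Segment* head) {
    // First pass: compute the total size so we allocate only once.
    std::size_t total = 0;
    for (const Segment* s = head; s != nullptr; s = s->next) {
        total += s->data_len;
    }

    std::vector<std::uint8_t> out;
    out.reserve(total);

    // Second pass: append each buffer's bytes.
    for (const Segment* s = head; s != nullptr; s = s->next) {
        out.insert(out.end(), s->data, s->data + s->data_len);
    }
    return out;
}
```

Elements downstream could then treat the result as an ordinary single-buffer packet, at the price of one allocation and one full copy per scattered packet.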

The best course of action would be for Mellanox to send packets to different queues according to their size (like the Tilera they acquired), so we could have queues with different buffer sizes.

What's the use case?

from fastclick.

cxxuser avatar cxxuser commented on August 24, 2024

I was inspired by the EuroSys '19 paper (Make the Most out of Last Level Cache), and I am trying to send all packets larger than 64B to one slice.

tbarbette avatar tbarbette commented on August 24, 2024

Ah, we had discussed that with @aliireza of course, and I think he never pursued it for the reason above. The cost of handling scattered packets is huge. That's also why LRO and GRO are used everywhere: it's much better to concatenate payloads, even at the cost of losing zero-copy, than to handle more packets/buffers/etc.

Remember that an rte_mbuf is 2 cache lines, and the second one can't even go to the right slice. So for the few ns you gain from L3 slice placement, handling many more descriptors is not worth it. I think it would be more efficient to ask Intel to provide a way to define the cache mapping exactly :p (Way beyond the ways of DDIO).
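
The descriptor overhead can be estimated with a little arithmetic. The sketch below is an illustration only: the function names and the example buffer sizes are assumptions, and the only figure taken from the comment above is that each buffer's metadata spans 2 cache lines.

```cpp
#include <cassert>
#include <cstddef>

// From the discussion: the metadata of one buffer occupies 2 cache lines.
constexpr std::size_t kMetadataLinesPerBuffer = 2;

// Number of buffers needed for a packet when the first buffer holds
// `first_buf` bytes and every following buffer holds `other_buf` bytes.
// Both sizes are hypothetical configuration values.
inline std::size_t segments_needed(std::size_t pkt_len,
                                   std::size_t first_buf,
                                   std::size_t other_buf) {
    assert(first_buf > 0 && other_buf > 0);
    if (pkt_len <= first_buf) {
        return 1;
    }
    const std::size_t rest = pkt_len - first_buf;
    return 1 + (rest + other_buf - 1) / other_buf;  // ceiling division
}

// Metadata cache lines touched just to describe the packet's buffers.
inline std::size_t metadata_cache_lines(std::size_t segments) {
    return segments * kMetadataLinesPerBuffer;
}
```

For a 1500B packet, a single 2048B buffer gives 1 segment and 2 metadata cache lines, while a 64B first buffer followed by a 2048B buffer gives 2 segments and 4 metadata cache lines, so the per-packet bookkeeping doubles before any payload is touched.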

cxxuser avatar cxxuser commented on August 24, 2024

Thanks a lot for your suggestion. All I wanted to do was measure how much performance overhead it would add.

cxxuser avatar cxxuser commented on August 24, 2024

I also think your method is the most reasonable one at the moment for exploiting NUCA properties. However, I also want to measure how much performance the other methods lose, so I want to turn on scattered mode in FastClick.

tbarbette avatar tbarbette commented on August 24, 2024

I'll close the issue because right now it's not supported.

We can discuss ways to implement something though. Maybe we can discuss this offline? [email protected]

cxxuser avatar cxxuser commented on August 24, 2024

Okay
