
Comments (3)

svenevs commented on September 4, 2024

OK, streams were added in this commit. As expected, there is no real benefit for this library itself. The primary benefit comes when a parent project uses streams, since calls to cudaThreadSynchronize and cudaDeviceSynchronize block all streams. I left most of the important changed code commented out, with the reimplementation just below it. The changes mostly consist of launching kernels on a specified stream and switching memcpy calls to their Async variants. I may have been a little overzealous with the cudaStreamSynchronize calls, but better safe than sorry. I also tried commenting all of them out, and there were no noticeable performance gains.
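For anyone unfamiliar with the pattern, here is a minimal sketch of that kind of change. It is not code from the fork: `scaleKernel` and `scaleOnStream` are hypothetical names made up for illustration, and the "Before" comments show the default-stream form each call would replace.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel (not from cudaSift): scales each element in place.
__global__ void scaleKernel(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Uploads, scales, and downloads `host` on the given stream.
// Returns true if every CUDA call succeeded.
bool scaleOnStream(std::vector<float> &host, float factor, cudaStream_t stream) {
    const int n = static_cast<int>(host.size());
    if (n == 0) return true;  // nothing to do, and a zero-block launch is invalid
    const size_t bytes = n * sizeof(float);

    float *dev = nullptr;
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return false;

    // Before: cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpyAsync(dev, host.data(), bytes, cudaMemcpyHostToDevice, stream);

    // Before: scaleKernel<<<blocks, threads>>>(dev, n, factor);
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    scaleKernel<<<blocks, threads, 0, stream>>>(dev, n, factor);
    cudaError_t launchErr = cudaGetLastError();

    // Before: cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaMemcpyAsync(host.data(), dev, bytes, cudaMemcpyDeviceToHost, stream);

    // Before: cudaDeviceSynchronize();  (waits for work on every stream)
    cudaError_t syncErr = cudaStreamSynchronize(stream);

    cudaFree(dev);
    return launchErr == cudaSuccess && syncErr == cudaSuccess;
}
```

Passing `0` as the stream gives the old default-stream behaviour. Note that the host vector here is ordinary pageable memory, so the Async copies cannot overlap with much; pinned memory (cudaMallocHost) would be needed for that.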

It's there if you want it, feel free to close the issue if this level of change is too divergent.

Some nvprof screenshots confirming equivalent performance:

No Streams: [image: no_streams]

With Streams: [image: with_streams]

With Streams Zoom: [image: with_streams_zoom]

In other words, there is no real overlap going on, but given the relative size of the transfers performed internally, this makes sense. The zoomed view may reveal overzealous cudaStreamSynchronize calls, but these were added because values are copied and then used right away as kernel parameters (e.g. copy the total number of points down to the CPU and then use it).
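To illustrate why a synchronize is needed in that copy-then-use situation, here is a small sketch. It is an assumed example, not code from the library: `countPositiveKernel` and `countPositive` are hypothetical names. The host reads the counter immediately after the asynchronous copy, so it has to wait on the stream first.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel (not from cudaSift): counts strictly positive entries.
__global__ void countPositiveKernel(const float *data, int n, unsigned int *count) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && data[i] > 0.0f) atomicAdd(count, 1u);
}

// Returns the number of positive entries in `host`, or -1 on a CUDA error.
int countPositive(const std::vector<float> &host, cudaStream_t stream) {
    const int n = static_cast<int>(host.size());
    if (n == 0) return 0;

    float *d_data = nullptr;
    unsigned int *d_count = nullptr;
    unsigned int h_count = 0;
    if (cudaMalloc(&d_data, n * sizeof(float)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_count, sizeof(unsigned int)) != cudaSuccess) {
        cudaFree(d_data);
        return -1;
    }

    cudaMemcpyAsync(d_data, host.data(), n * sizeof(float),
                    cudaMemcpyHostToDevice, stream);
    cudaMemsetAsync(d_count, 0, sizeof(unsigned int), stream);
    countPositiveKernel<<<(n + 255) / 256, 256, 0, stream>>>(d_data, n, d_count);
    cudaMemcpyAsync(&h_count, d_count, sizeof(unsigned int),
                    cudaMemcpyDeviceToHost, stream);

    // h_count is read on the host right after this, so wait for the copy to land.
    cudaError_t err = cudaStreamSynchronize(stream);

    cudaFree(d_data);
    cudaFree(d_count);
    return err == cudaSuccess ? static_cast<int>(h_count) : -1;
}
```

In real code the returned count would typically be used to size the next kernel launch, which is why the wait cannot simply be dropped.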

Summary


Without user action, everything remains on the default stream (stream 0). Users who need to keep cudaSift on a specific stream can now do so. The user is assumed to create and destroy the streams themselves; no checks are performed to validate them.
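A rough sketch of what that ownership looks like on the caller's side follows. `libraryWork` is a hypothetical stand-in for any library call that accepts a stream; it is not the actual cudaSift interface, and `runOnOwnStream` is likewise made up for illustration.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Hypothetical stand-in for a library entry point that takes a stream.
// NOT the cudaSift API: it only zeroes a buffer on the given stream.
cudaError_t libraryWork(void *d_buf, size_t bytes, cudaStream_t stream) {
    return cudaMemsetAsync(d_buf, 0, bytes, stream);
}

// The caller owns the stream: it creates it, passes it in, waits on it,
// and destroys it. The library does none of this and does not validate it.
bool runOnOwnStream() {
    cudaStream_t stream;
    if (cudaStreamCreate(&stream) != cudaSuccess) return false;

    void *d_buf = nullptr;
    const size_t bytes = 1024;
    bool ok = cudaMalloc(&d_buf, bytes) == cudaSuccess;
    ok = ok && libraryWork(d_buf, bytes, stream) == cudaSuccess;
    ok = ok && cudaStreamSynchronize(stream) == cudaSuccess;

    cudaFree(d_buf);            // freeing a null pointer is a no-op
    cudaStreamDestroy(stream);  // caller's responsibility, not the library's
    return ok;
}
```

Since nothing validates the stream, passing one that has already been destroyed is an error the caller has to avoid.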

One important caveat: everything under MANAGEDMEM is broken. I don't understand the implications of combining managed memory with streams, so I mostly left that code untouched.

from cudasift.

svenevs commented on September 4, 2024

I'm attaching a zip of the fork so that anybody who needs stream support has a place to start. The attached code diverged from this repository at commit 263df64e0192dc4bbc6fb40c7f6c153822649ddf (back when Maxwell was the default branch). Diff against that commit to see some of the required changes, but you will definitely want to redo that work against current master, because the code base has improved and changed a lot. Note that the zip contains many other changes as well; I was using this in a library / child project and needed to support certain things.

CudaSift.zip

Regardless, thanks Mårten for the awesome library, it's been very helpful for me but I don't work with this stuff anymore ;)


svenevs commented on September 4, 2024

Hello, I will not be "maintaining" that fork with stream support anymore. It will either be transferred or deleted. Please respond there if you would like the repository transferred to you.

