
Comments (9)

hughperkins commented on May 19, 2024

Can you try to create a very tiny test case, eg use the examples in the test/cocl folder as a basis, so I can reproduce the issue on my own machine?

from coriander.

hughperkins commented on May 19, 2024

Oh, you mean, you are calling new inside the kernel?


AJcodes commented on May 19, 2024

> Oh, you mean, you are calling new inside the kernel?

Yes, the new and delete operators are called from inside the kernel.


hughperkins commented on May 19, 2024

Oh wow. This is new information for me :-). A very simple test case I can use would be good, though. I doubt I'm going to implement this any time soon, but it depends. As for how to implement it ...

It could be possible, actually. This would tie into the new virtual memory management that's sort of evolving. Basically, we'd allocate one ginormous GPU buffer right at the start, from the host side, inside Coriander, and then just dole out little pieces of it whenever people call cudaMalloc etc. on the host side. We could then pass this single buffer into kernels, and dole out parts of it to the kernel itself.
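To make the "one big buffer, doled out in pieces" idea concrete, here is a minimal sketch in plain C. It is not Coriander's code: `BigBuffer`, `suballoc` and `SUBALLOC_FAILED` are hypothetical names, and the sketch only tracks offsets into the buffer, never freeing anything.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for the single large buffer (not a Coriander type). */
typedef struct {
    size_t capacity;  /* total bytes in the big buffer */
    size_t next;      /* offset of the first unused byte */
} BigBuffer;

#define SUBALLOC_FAILED ((size_t)-1)

/* Hand out `size` bytes aligned to `align` (must be a power of two).
   Returns an offset into the big buffer, or SUBALLOC_FAILED if it does not fit. */
static size_t suballoc(BigBuffer *buf, size_t size, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    /* Round the next free offset up to the requested alignment. */
    size_t start = (buf->next + align - 1) & ~(align - 1);
    if (start > buf->capacity || size > buf->capacity - start) {
        return SUBALLOC_FAILED;  /* leave the buffer state untouched */
    }
    buf->next = start + size;
    return start;
}
```

Each cudaMalloc-style request would then map to one `suballoc` call, with the returned offset playing the role of the virtual address. Supporting free or delete would need more than this bump pointer, for example a free list.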

Well... hmmm... yeah... that should work. We have already started to handle virtual memory device-side:

#define __vmem2__
struct GlobalVars {
    local int *scratch;
    global char *clmem0;
    unsigned long clmem_vmem_offset0;
};
inline global float *getGlobalPointer(__vmem__ unsigned long vmemloc, const struct GlobalVars* const globalVars) {
    return (global float *)(globalVars->clmem0 + vmemloc - globalVars->clmem_vmem_offset0);
}
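As a worked example of the address arithmetic in `getGlobalPointer`, the following plain C stand-in drops the OpenCL qualifiers so it can run on the host. `HostGlobalVars` and `get_pointer` are hypothetical names for illustration only; the field names are taken from the struct above.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side stand-in for the device struct above, OpenCL qualifiers dropped. */
typedef struct {
    char *clmem0;                      /* base of the single big buffer */
    unsigned long clmem_vmem_offset0;  /* virtual address that maps to clmem0[0] */
} HostGlobalVars;

/* Translate a virtual address into a real pointer inside the big buffer:
   subtract the buffer's virtual base, then add the result to the real base. */
static float *get_pointer(unsigned long vmemloc, const HostGlobalVars *g) {
    assert(vmemloc >= g->clmem_vmem_offset0);
    return (float *)(g->clmem0 + (vmemloc - g->clmem_vmem_offset0));
}
```

For instance, if the buffer's virtual base is 4096, then virtual address 4096 + 3 * sizeof(float) resolves to the fourth float in the buffer.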

What is your use-case? To what extent can you work around this issue for now?


AJcodes commented on May 19, 2024

The use case requires allocating and de-allocating memory on the fly and in parallel, though I'll have to estimate the effort to work around the issue for now.
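To illustrate what "allocating in parallel" demands of an allocator, here is a minimal sketch using C11 atomics on the host. It is only an analogy for what many GPU threads would do with an atomic counter over a shared heap; `SharedHeap`, `heap_init`, `heap_alloc` and `HEAP_FULL` are hypothetical names, and the sketch covers allocation only, not delete.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define HEAP_FULL ((size_t)-1)

/* Hypothetical shared heap: a capacity plus an atomically updated bump offset. */
typedef struct {
    size_t capacity;
    atomic_size_t next;
} SharedHeap;

static void heap_init(SharedHeap *heap, size_t capacity) {
    heap->capacity = capacity;
    atomic_init(&heap->next, 0);
}

/* Reserve `size` bytes; safe to call from several threads at once.
   Returns the offset of the reserved block, or HEAP_FULL if it does not fit. */
static size_t heap_alloc(SharedHeap *heap, size_t size) {
    assert(size > 0);  /* zero-byte requests are treated as caller bugs here */
    size_t old = atomic_load(&heap->next);
    for (;;) {
        if (size > heap->capacity - old) {
            return HEAP_FULL;
        }
        /* On failure, `old` is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(&heap->next, &old, old + size)) {
            return old;
        }
    }
}
```

The compare-and-swap loop guarantees that no two callers receive overlapping blocks. De-allocating on the fly is the harder part and would need extra structure, such as per-size free lists.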

As for a test case, I've taken one of the CUDA samples and tweaked it. There is another issue I forgot to bring up: when trying to set the heap allocation limit for a thread, the following error is thrown:

error: use of undeclared identifier
      'cudaLimitMallocHeapSize'
    cudaThreadSetLimit(cudaLimitMallocHeapSize, 128 * (1 << 20));

I've commented it out so you can see the runtime error
newdelete.tar.gz


hughperkins commented on May 19, 2024


hughperkins commented on May 19, 2024


hughperkins commented on May 19, 2024


AJcodes commented on May 19, 2024

We can store the vmem table at the start of the ginormous buffer, in global memory.

I imagine it would be possible to allocate a fixed heap size up front, though support for the following call should be considered too:
cudaThreadSetLimit(cudaLimitMallocHeapSize, <value>);
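As a rough illustration of keeping bookkeeping at the start of the big buffer together with a configurable heap size, here is a sketch in plain C. `HeapHeader` and `heap_format` are hypothetical names, and clamping the requested limit to the available space is an assumption of this sketch, not documented CUDA or Coriander behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical header stored in the first bytes of the big buffer. */
typedef struct {
    size_t heap_limit;  /* most bytes the in-kernel heap may hand out */
    size_t heap_used;   /* bytes handed out so far */
} HeapHeader;

/* Write a fresh header at the front of `buffer`.
   Returns the heap limit actually recorded, or 0 if the header does not fit. */
static size_t heap_format(void *buffer, size_t buffer_size, size_t requested_limit) {
    assert(buffer != NULL);
    if (buffer_size < sizeof(HeapHeader)) {
        return 0;
    }
    size_t available = buffer_size - sizeof(HeapHeader);
    HeapHeader header;
    /* Assumption: a request larger than the buffer is clamped, not rejected. */
    header.heap_limit = requested_limit < available ? requested_limit : available;
    header.heap_used = 0;
    memcpy(buffer, &header, sizeof header);
    return header.heap_limit;
}
```

A host-side handler for the limit call could invoke something like `heap_format` before the first kernel launch, so that kernels find the limit in global memory.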

How long would it take to have this implemented in Coriander? I ask this because the current project I'm porting has a lot of intertwining dependencies on new and delete, and it would take a long time just to work around these dependencies.

