pbrt currently reports "FATAL CUDA error: invalid device symbol" and dumps a stack tra


Support for multiple GPU SM versions about pbrt-v4 HOT 9 OPEN

mmp commented on August 15, 2024

Support for multiple GPU SM versions

from pbrt-v4.

Comments (9)

pierremoreau commented on August 15, 2024

In CMake 3.18, CMAKE_CUDA_ARCHITECTURES was added, which gives the user a way to specify which architectures to compile for (CMake automatically forwards those values to any CUDA file being compiled). cmake/checkcuda.cu could be changed to return the SM versions of all detected GPUs, and that string would be used if the user did not define CMAKE_CUDA_ARCHITECTURES.
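
The checkcuda.cu change can be pictured as two halves: querying each device's compute capability, and turning the results into the semicolon-separated list that CMAKE_CUDA_ARCHITECTURES accepts. The sketch below covers only the second half, in plain C++ so that it compiles without the CUDA toolkit. `ComputeCapability` and `archListForCMake` are hypothetical names, not pbrt code; the real program would presumably fill the vector from the `major` and `minor` fields that `cudaGetDeviceProperties` reports for each device.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the (major, minor) pair that the CUDA runtime
// reports per device; in checkcuda.cu it would come from cudaDeviceProp.
struct ComputeCapability {
    int major;
    int minor;
};

// Build a deduplicated, sorted list such as "75;86", the format accepted by
// CMAKE_CUDA_ARCHITECTURES.
std::string archListForCMake(const std::vector<ComputeCapability> &devices) {
    std::set<int> archs;  // std::set drops duplicates and keeps values sorted
    for (const ComputeCapability &cc : devices) {
        assert(cc.major > 0 && cc.minor >= 0 && cc.minor < 10);
        archs.insert(cc.major * 10 + cc.minor);
    }
    std::string result;
    for (int arch : archs) {
        if (!result.empty())
            result += ';';
        result += std::to_string(arch);
    }
    return result;
}
```

A machine with two SM 7.5 GPUs and one SM 8.6 GPU would yield a single entry per architecture, so the build does not compile the same target twice.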


digital-pro commented on August 15, 2024

Has anyone done this successfully and have insights to share? We have dockerized pbrt-v4 for GPUs, and made it possible for our students and researchers to remotely render scenes on our servers that have GPUs. Which, by the way, is awesome, and puts us miles ahead of where we were with v3. However, some of our servers have GPUs with multiple architectures, and we've been unable to build a binary (and in turn a Docker image, since we Dockerize everything) that can run on any GPU architecture other than the primary GPU. I've tried fiddling with CMAKE_CUDA_ARCHITECTURE and some other tweaks, but haven't gotten anything to work. Thanks! -- David Cardinal, Vistalab, Stanford


mmp commented on August 15, 2024

> We have dockerized pbrt-v4 for GPUs, and made it possible for our students and researchers to remotely render scenes on our servers that have GPUs.

Cool!

If I do something like this:

% cmake -G Ninja -DPBRT_OPTIX7_PATH=~/optix-7.4.0 ~/pbrt-v4 -DPBRT_GPU_SHADER_MODEL=sm_60

I am able to build a binary that is, as far as I can tell, compiled with the flags to specify shader model 6.0. Is your issue being unable to compile with a specified shader model, unable to compile a single binary with multiple shader models, or finding that the binary is invalid in spite of the above?

FWIW I haven't been able to figure out how to compile a single binary that supports multiple shader models.


digital-pro commented on August 15, 2024

Matt -- Thanks!! It seems to be working. I can build our Docker image on the same Linux server for both our 3070 and the 2080 Tis that we were lucky enough to have donated :) That means we have at least 3 GPUs live that people can render on, even if their personal machine is a low-end box. I'm especially happy, as my major area of interest is computational photography, so bursts of images are needed.

No further info on how to make a single binary for multiple architectures. Do you think any of the new -arch flags could help with that, or are they not relevant to the pbrt compilation pipeline?

In any case, this is great progress. Thanks!

-- David


mmp commented on August 15, 2024

Great! As far as I can tell a single binary for multiple architectures should be possible via the "fat binary" functionality of nvcc, but I'm not sure how to wire that up with the cmake stuff. Another issue is that pbrt's OptiX kernels would need to be handled similarly, which I'm not sure how to do either. Anyway, something to hopefully be fixed someday, but glad you're set for now.


pierremoreau commented on August 15, 2024

From what I understand from the doc, specifying CMAKE_CUDA_ARCHITECTURES="70-real;72-real" will have CMake automatically forward those architectures to nvcc and build for those (I am guessing using its “fat binary” functionality). That would however require bumping to CMake 3.18, but I do not think it would be too difficult to emulate that feature on our own.

Now for the OptiX kernels, compiling to multiple SM versions using CMake should not be too hard but I am not sure how the binaries would be specified for the applications to load them as expected.


mmp commented on August 15, 2024

Ah, that's helpful. It looks like CMake 3.18 was released in 2020, though, which would require many folks to manually upgrade, which is somewhat undesirable for everyone who doesn't need this functionality.

The OptiX kernels are basically compiled to PTX and then encoded as a big string that's stored in a global variable that's linked into the executable:

extern const unsigned char PBRT_EMBEDDED_PTX[];

That string is passed in to the OptiX API. So "all" that would be necessary there, I think, would be to do that step multiple times with different --gpu-architecture settings, give each result a unique variable name, and then choose the appropriate string at runtime for the architecture being used.


pierremoreau commented on August 15, 2024

How would the unique naming work? Would src/pbrt/gpu/aggregate.cpp need to contain something like the following?

extern "C" {
extern const unsigned char PBRT_EMBEDDED_PTX_SM30[];
extern const unsigned char PBRT_EMBEDDED_PTX_SM50[];
extern const unsigned char PBRT_EMBEDDED_PTX_SM60[];
extern const unsigned char PBRT_EMBEDDED_PTX_SM70[];
extern const unsigned char PBRT_EMBEDDED_PTX_SM71[];
extern const unsigned char PBRT_EMBEDDED_PTX_SM72[];
}

And then when calling createOptiXModule(), the right variable would be passed in?


mmp commented on August 15, 2024

Something like that. Come to think of it, another option might be for aggregate.cpp to have an extern std::map<std::string, const unsigned char *> archPTX declaration and to then attempt lookups in that with the selected GPU's architecture. Then the build could automatically generate a .cpp file that had something like:

#include <map>
#include <string>

extern const unsigned char PBRT_EMBEDDED_PTX_SM80[];
// ...

std::map<std::string, const unsigned char *> archPTX {
    { "sm80", PBRT_EMBEDDED_PTX_SM80 },
    // ...
};
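
The lookup side of this idea could look roughly like the sketch below. It keys the table by integer SM version rather than by string, so that when there is no exact match it can fall back to the newest embedded PTX that does not exceed the device's version (PTX built for an older virtual architecture can generally be JIT-compiled for a newer GPU). `selectPTX` and the integer-keyed table are assumptions for illustration, not pbrt code.

```cpp
#include <cassert>
#include <iterator>
#include <map>

// Hypothetical helper: return the embedded PTX best suited to a device with
// the given SM version (e.g. 86 for sm_86), or nullptr if none is usable.
const unsigned char *selectPTX(
    const std::map<int, const unsigned char *> &archPTX, int deviceSM) {
    assert(deviceSM > 0);
    // upper_bound finds the first entry strictly newer than the device...
    auto it = archPTX.upper_bound(deviceSM);
    // ...so the entry just before it is the newest one the device can use.
    if (it == archPTX.begin())
        return nullptr;  // every embedded PTX targets a newer architecture
    return std::prev(it)->second;
}
```

With entries for 60 and 80, a device reporting 86 would get the SM 8.0 string and a device reporting 75 would get the SM 6.0 string. A caller that gets nullptr back could report a clear "unsupported GPU architecture" error before handing anything to OptiX.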
