
Comments (9)

pierremoreau commented on August 15, 2024

CMake 3.18 added CMAKE_CUDA_ARCHITECTURES, which lets the user specify which architectures to compile for (CMake automatically forwards those values to any CUDA file being compiled). cmake/checkcuda.cu could be changed to return the SM versions of all detected GPUs, and that string would be used if the user did not define CMAKE_CUDA_ARCHITECTURES.
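As a rough sketch of the second half of that idea, the string handed back to CMake could be built as below. `FormatArchList` is a hypothetical name, not something in pbrt-v4. In cmake/checkcuda.cu the (major, minor) pairs would presumably come from `cudaGetDeviceProperties()` for each device; here they are plain input so the snippet compiles without the CUDA toolkit.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: turn (major, minor) compute capabilities into a
// semicolon-separated list such as "75;86", the format that
// CMAKE_CUDA_ARCHITECTURES expects.
std::string FormatArchList(const std::vector<std::pair<int, int>> &caps) {
    std::set<int> archs;  // std::set keeps the values sorted and de-duplicated
    for (const auto &cap : caps) {
        assert(cap.first > 0 && cap.second >= 0 && cap.second < 10);
        archs.insert(cap.first * 10 + cap.second);
    }
    std::string out;
    for (int arch : archs) {
        if (!out.empty()) out += ';';
        out += std::to_string(arch);
    }
    return out;
}
```

For a machine with one 3070 (8.6) and two 2080 Tis (7.5), this yields "75;86".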

from pbrt-v4.

digital-pro commented on August 15, 2024

Has anyone done this successfully and have insights to share? We have dockerized pbrt-v4 for GPUs, and made it possible for our students and researchers to remotely render scenes on our servers that have GPUs. Which, by the way, is awesome, and puts us miles ahead of where we were with v3. However, some of our servers have GPUs with multiple architectures, and we've been unable to build a binary (and in turn a Docker image, since we Dockerize everything) that can run on any GPU architecture other than that of the primary GPU. I've tried fiddling with CMAKE_CUDA_ARCHITECTURES and some other tweaks, but haven't gotten anything to work. Thanks! -- David Cardinal, Vistalab, Stanford


mmp commented on August 15, 2024

We have dockerized pbrt-v4 for GPUs, and made it possible for our students and researchers to remotely render scenes on our servers that have GPUs.

Cool!

If I do something like this:

% cmake -G Ninja -DPBRT_OPTIX7_PATH=~/optix-7.4.0 ~/pbrt-v4 -DPBRT_GPU_SHADER_MODEL=sm_60

I am able to build a binary that is, as far as I can tell, compiled with the flags to specify shader model 6.0. Is your issue being unable to compile with a specified shader model, unable to compile a single binary with multiple shader models, or finding that the binary is invalid in spite of the above?

FWIW I haven't been able to figure out how to compile a single binary that supports multiple shader models.


digital-pro commented on August 15, 2024

Matt -- Thanks!! It seems to be working. I can build our Docker image on the same Linux server for both our 3070 and the 2080 Tis that we were lucky enough to have donated :) That means we have at least 3 GPUs live that people can render on, even if their personal machine is a low-end box. I'm especially happy, as my major area of interest is computational photography, so bursts of images are needed.

No further info on how to make a single binary for multiple architectures. Do you think any of the new -arch flags could help with that, or are they perhaps not relevant to the pbrt compilation pipeline?

In any case, this is great progress. Thanks!

-- David


mmp commented on August 15, 2024

Great! As far as I can tell a single binary for multiple architectures should be possible via the "fat binary" functionality of nvcc, but I'm not sure how to wire that up with the cmake stuff. Another issue is that pbrt's OptiX kernels would need to be handled similarly, which I'm not sure how to do either. Anyway, something to hopefully be fixed someday, but glad you're set for now.


pierremoreau commented on August 15, 2024

From what I understand from the doc, specifying CMAKE_CUDA_ARCHITECTURES="70-real;72-real" will have CMake automatically forward those architectures to nvcc and build for those (I am guessing using its “fat binary” functionality). That would however require bumping to CMake 3.18, but I do not think it would be too difficult to emulate that feature on our own.

Now for the OptiX kernels, compiling to multiple SM versions using CMake should not be too hard but I am not sure how the binaries would be specified for the applications to load them as expected.
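To make the "emulate that feature on our own" idea a bit more concrete, here is a minimal sketch of the mapping from an architecture list to nvcc options. It is written in C++ only to keep it self-contained; in the build it would be CMake logic. `GencodeFlags` is a hypothetical name, and the assumed mapping (an `NN-real` entry produces SASS only, `NN-virtual` produces PTX only, and a bare `NN` produces both) is my reading of the CMake documentation rather than something verified against CMake's source.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: expand a CMAKE_CUDA_ARCHITECTURES-style list such as
// "70-real;72" into nvcc --generate-code options.
std::vector<std::string> GencodeFlags(const std::string &archList) {
    std::vector<std::string> flags;
    std::istringstream in(archList);
    std::string entry;
    while (std::getline(in, entry, ';')) {
        if (entry.empty()) continue;
        std::string num = entry, kind;  // kind stays empty for a bare "NN"
        std::size_t dash = entry.find('-');
        if (dash != std::string::npos) {
            num = entry.substr(0, dash);
            kind = entry.substr(dash + 1);
        }
        assert(kind.empty() || kind == "real" || kind == "virtual");
        const std::string prefix =
            "--generate-code=arch=compute_" + num + ",code=";
        if (kind != "virtual") flags.push_back(prefix + "sm_" + num);    // SASS
        if (kind != "real") flags.push_back(prefix + "compute_" + num);  // PTX
    }
    return flags;
}
```

With the input "70-real;72" this produces one option for sm_70 and two for 72 (one for sm_72, one embedding compute_72 PTX).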


mmp commented on August 15, 2024

Ah, that's helpful. It looks like CMake 3.18 was released in 2020, though, which would require many folks to manually upgrade, which is somewhat undesirable for everyone who doesn't need this functionality.

The OptiX kernels are basically compiled to PTX and then encoded as a big string that's stored in a global variable that's linked into the executable:

extern const unsigned char PBRT_EMBEDDED_PTX[];

That string is passed in to the OptiX API. So "all" that would be necessary there, I think, would be to do that step multiple times with different --gpu-architecture settings, give each result a unique variable name, and then choose the appropriate string at runtime for the architecture being used.


pierremoreau commented on August 15, 2024

How would the unique naming work? Would src/pbrt/gpu/aggregate.cpp need to contain something like the following?

extern "C" {
extern const unsigned char PBRT_EMBEDDED_PTX_SM30[];
extern const unsigned char PBRT_EMBEDDED_PTX_SM50[];
extern const unsigned char PBRT_EMBEDDED_PTX_SM60[];
extern const unsigned char PBRT_EMBEDDED_PTX_SM70[];
extern const unsigned char PBRT_EMBEDDED_PTX_SM71[];
extern const unsigned char PBRT_EMBEDDED_PTX_SM72[];
}

And then when calling createOptiXModule(), the right variable would be passed in?


mmp commented on August 15, 2024

Something like that. Come to think of it, another option might be for aggregate.cpp to have an extern std::map<std::string, const unsigned char *> archPTX declaration and to then attempt lookups in that with the selected GPU's architecture. Then the build could automatically generate a .cpp file that had something like:

#include <map>
#include <string>

extern const unsigned char PBRT_EMBEDDED_PTX_SM80[];
// ...

// Maps an architecture name to the PTX compiled for it.
std::map<std::string, const unsigned char *> archPTX {
    { "sm80", PBRT_EMBEDDED_PTX_SM80 },
    // ...
};
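The runtime lookup could then be sketched roughly as follows. This is only an illustration: `PTXTable` and `SelectPTX` are hypothetical names, and the table is keyed by an integer compute capability (80 for sm_80) instead of a string so that a fallback is easy to express. The fallback assumes that PTX built for an older architecture can still be loaded on a newer GPU, which is how PTX is generally described as behaving but has not been checked against pbrt's OptiX kernels.

```cpp
#include <cassert>
#include <iterator>
#include <map>

// Hypothetical table: compute capability (e.g. 80 for sm_80) -> embedded PTX.
using PTXTable = std::map<int, const unsigned char *>;

// Return the PTX built for the newest architecture that does not exceed the
// device's, or nullptr if every entry targets a newer architecture.
const unsigned char *SelectPTX(const PTXTable &table, int deviceArch) {
    assert(deviceArch > 0);
    auto it = table.upper_bound(deviceArch);  // first entry with key > deviceArch
    if (it == table.begin()) return nullptr;  // nothing old enough (or empty)
    return std::prev(it)->second;
}
```

With entries for 60 and 80, a device reporting 86 would get the sm_80 string, a device reporting 75 would get the sm_60 string, and a device reporting 50 would get nullptr, at which point the caller could report a clear error instead of handing OptiX an unusable module.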
