Comments (9)
In CMake 3.18, CMAKE_CUDA_ARCHITECTURES was added, which gives the user a way to specify which architectures to compile for (CMake automatically forwards those values to any CUDA file being compiled). cmake/checkcuda.cu could be changed to return the SM versions of all detected GPUs, and that string would be used if CMAKE_CUDA_ARCHITECTURES was not defined by the user.
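As a rough illustration of the string such a detection step might hand back to CMake, here is a minimal C++ sketch. The helper name formatArchList is hypothetical and is not part of pbrt-v4. The GPU query itself is left out so the snippet runs without CUDA; in a real checkcuda.cu the (major, minor) pairs would presumably come from cudaGetDeviceProperties() for each device reported by cudaGetDeviceCount().

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: turns (major, minor) compute capabilities into a
// CMake-style list such as "75;86", sorted and without duplicates.
// Assumption: the pairs were gathered elsewhere (for example with
// cudaGetDeviceProperties() once per detected device).
std::string formatArchList(std::vector<std::pair<int, int>> caps) {
    std::sort(caps.begin(), caps.end());
    caps.erase(std::unique(caps.begin(), caps.end()), caps.end());
    std::string out;
    for (const auto &c : caps) {
        assert(c.first > 0 && c.second >= 0 && c.second < 10);
        if (!out.empty()) out += ';';
        out += std::to_string(c.first * 10 + c.second);
    }
    return out;
}
```

For a machine with a 2080 Ti (7.5) and a 3070 (8.6), the pairs {7,5} and {8,6} would yield a list with one entry per architecture, which could then serve as the default for CMAKE_CUDA_ARCHITECTURES.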
from pbrt-v4.
Has anyone done this successfully and have insights to share? We have dockerized pbrt-v4 for GPUs, and made it possible for our students and researchers to remotely render scenes on our servers that have GPUs. Which, by the way, is awesome, and puts us miles ahead of where we were with v3. However, some of our servers have GPUs with multiple architectures, and we've been unable to build a binary (and in turn a Docker image, since we Dockerize everything) that can run on any GPU architecture other than the primary GPU. I've tried fiddling with CMAKE_CUDA_ARCHITECTURE and some other tweaks, but haven't gotten anything to work. Thanks! -- David Cardinal, Vistalab, Stanford
We have dockerized pbrt-v4 for GPUs, and made it possible for our students and researchers to remotely render scenes on our servers that have GPUs.
Cool!
If I do something like this:
% cmake -G Ninja -DPBRT_OPTIX7_PATH=~/optix-7.4.0 ~/pbrt-v4 -DPBRT_GPU_SHADER_MODEL=sm_60
I am able to build a binary that is, as far as I can tell, compiled with the flags to specify shader model 6.0. Is your issue being unable to compile with a specified shader model, unable to compile a single binary with multiple shader models, or finding that the binary is invalid in spite of the above?
FWIW I haven't been able to figure out how to compile a single binary that supports multiple shader models.
Matt -- Thanks!! It seems to be working. I can build our Docker image on the same Linux server for both our 3070 and the 2080 Ti cards that we were lucky enough to have donated :) That means we have at least 3 GPUs live that people can render on, even if their personal machine is a low-end box. I'm especially happy, as my major area of interest is computational photography, so bursts of images are needed.
No further info on how to make a single binary for multiple architectures. Do you think any of the new -arch flags could help with that, or are they maybe not relevant for the pbrt compilation pipeline?
In any case, this is great progress. Thanks!
-- David
Great! As far as I can tell a single binary for multiple architectures should be possible via the "fat binary" functionality of nvcc, but I'm not sure how to wire that up with the cmake stuff. Another issue is that pbrt's OptiX kernels would need to be handled similarly, which I'm not sure how to do either. Anyway, something to hopefully be fixed someday, but glad you're set for now.
From what I understand from the CMake documentation, specifying CMAKE_CUDA_ARCHITECTURES="70-real;72-real" will have CMake automatically forward those architectures to nvcc and build for them (I am guessing using its “fat binary” functionality). That would, however, require bumping to CMake 3.18, but I do not think it would be too difficult to emulate that feature on our own.
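To make the "emulate that feature" idea a bit more concrete, here is a minimal C++ sketch of the mapping from a CMAKE_CUDA_ARCHITECTURES-style list to nvcc -gencode flags. The function name gencodeFlags is hypothetical, and the mapping reflects my reading of the CMake documentation ("-real" produces device code only, "-virtual" produces PTX only, no suffix produces both). In pbrt's build this logic would live in CMake rather than C++, so treat it as an illustration of the transformation, not a drop-in.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: expands a list such as "70-real;72-real" into one
// nvcc -gencode flag per entry.
std::vector<std::string> gencodeFlags(const std::string &archList) {
    std::vector<std::string> flags;
    std::istringstream in(archList);
    std::string entry;
    while (std::getline(in, entry, ';')) {
        if (entry.empty()) continue;
        // Split "70-real" into the number ("70") and the suffix ("real").
        const std::string num = entry.substr(0, entry.find('-'));
        const std::string suffix = entry.size() > num.size()
                                       ? entry.substr(num.size() + 1)
                                       : std::string();
        assert(!num.empty());
        std::string code;
        if (suffix == "real")
            code = "sm_" + num;        // device code only
        else if (suffix == "virtual")
            code = "compute_" + num;   // PTX only
        else
            code = "[compute_" + num + ",sm_" + num + "]";  // both
        flags.push_back("-gencode=arch=compute_" + num + ",code=" + code);
    }
    return flags;
}
```

Passing several such flags in one nvcc invocation is what produces a fat binary containing code for each listed architecture.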
Now for the OptiX kernels, compiling to multiple SM versions using CMake should not be too hard but I am not sure how the binaries would be specified for the applications to load them as expected.
Ah, that's helpful. It looks like CMake 3.18 was released in 2020, though, which would require many folks to manually upgrade, which is somewhat undesirable for everyone who doesn't need this functionality.
The OptiX kernels are basically compiled to PTX and then encoded as a big string that's stored in a global variable that's linked into the executable:
extern const unsigned char PBRT_EMBEDDED_PTX[];
That string is passed in to the OptiX API. So "all" that would be necessary there, I think, would be to do that step multiple times with different --gpu-architecture settings, have unique variable names, and then choose the appropriate string for the architecture being used at runtime.
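The runtime selection step could look roughly like the C++ sketch below. Everything here is an assumption for illustration: the per-architecture variable names, the EmbeddedPTX table, and the selectPTX function do not exist in pbrt-v4, and the short strings stand in for the real embedded PTX. The fallback relies on PTX generally being usable on devices with a compute capability at or above the one it was generated for.

```cpp
#include <cassert>

// Stand-ins for the embedded PTX strings (hypothetical names); in a real
// build each would come from a separate PTX compilation step.
const unsigned char PBRT_EMBEDDED_PTX_SM60[] = "// PTX for compute_60";
const unsigned char PBRT_EMBEDDED_PTX_SM70[] = "// PTX for compute_70";
const unsigned char PBRT_EMBEDDED_PTX_SM80[] = "// PTX for compute_80";

struct EmbeddedPTX {
    int arch;                  // e.g. 70 for compute capability 7.0
    const unsigned char *ptx;  // NUL-terminated PTX source
};

// Must be kept sorted by ascending arch for selectPTX() to work.
const EmbeddedPTX kEmbeddedPTX[] = {
    {60, PBRT_EMBEDDED_PTX_SM60},
    {70, PBRT_EMBEDDED_PTX_SM70},
    {80, PBRT_EMBEDDED_PTX_SM80},
};

// Returns the PTX built for the highest embedded architecture that does
// not exceed the device's, or nullptr if the device is older than all.
const unsigned char *selectPTX(int major, int minor) {
    assert(major > 0 && minor >= 0);
    const int deviceArch = major * 10 + minor;
    const unsigned char *best = nullptr;
    for (const EmbeddedPTX &e : kEmbeddedPTX)
        if (e.arch <= deviceArch) best = e.ptx;
    return best;
}
```

The caller would query the selected GPU's compute capability, pass it to selectPTX(), and hand the result to the OptiX module creation code in place of the single PBRT_EMBEDDED_PTX string.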
How would the unique naming work? Would src/pbrt/gpu/aggregate.cpp need to contain something like the following?
extern "C" {
extern const unsigned char PBRT_EMBEDDED_PTX_SM30[];
extern const unsigned char PBRT_EMBEDDED_PTX_SM50[];
extern const unsigned char PBRT_EMBEDDED_PTX_SM60[];
extern const unsigned char PBRT_EMBEDDED_PTX_SM70[];
extern const unsigned char PBRT_EMBEDDED_PTX_SM71[];
extern const unsigned char PBRT_EMBEDDED_PTX_SM72[];
}
And then when calling createOptiXModule(), the right variable would be passed in?
Something like that. Come to think of it, another option might be for aggregate.cpp to have an extern std::map<std::string, const unsigned char *> archPTX declaration and to then attempt lookups in that with the selected GPU's architecture. Then the build could automatically generate a .cpp file that had something like:
#include <map>
#include <string>

extern const unsigned char PBRT_EMBEDDED_PTX_SM80[];
// ...
std::map<std::string, const unsigned char *> archPTX {
    { "sm80", PBRT_EMBEDDED_PTX_SM80 },
    // ...
};
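For the lookup side of the map approach, a minimal C++ sketch follows. It assumes the "sm80"-style key format from the snippet above; the lookupPTX function, the SM86 entry, and the stand-in strings are hypothetical and only there to make the example self-contained.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-ins for the generated file's contents (hypothetical data).
const unsigned char PBRT_EMBEDDED_PTX_SM80[] = "// PTX for compute_80";
const unsigned char PBRT_EMBEDDED_PTX_SM86[] = "// PTX for compute_86";

const std::map<std::string, const unsigned char *> archPTX{
    {"sm80", PBRT_EMBEDDED_PTX_SM80},
    {"sm86", PBRT_EMBEDDED_PTX_SM86},
};

// Builds a key such as "sm86" from the device's compute capability and
// returns the matching PTX, or nullptr when that architecture was not
// compiled in.
const unsigned char *lookupPTX(int major, int minor) {
    assert(major > 0 && minor >= 0);
    const std::string key =
        "sm" + std::to_string(major) + std::to_string(minor);
    const auto it = archPTX.find(key);
    return it == archPTX.end() ? nullptr : it->second;
}
```

An exact-match lookup like this keeps the generated file trivial, at the cost of needing an explicit error or fallback path when lookupPTX() returns nullptr.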