Comments (106)

richardmgoodin avatar richardmgoodin commented on September 17, 2024

Just commenting to confirm that when I remove PBRT_OPTIX7_PATH, the code compiles fine.

from pbrt-v4.

wangchi87 avatar wangchi87 commented on September 17, 2024

I get these errors on VS2019 as well when CUDA and OptiX are turned on:

  1. #include <sys/syscall.h> was not found
  2. MSB3721: “"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin\nvcc.exe" -gencode=arch=compute_75,code="sm_75,compute_75" --use-local-env -ccbin "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.25.28610\bin\HostX64\x64" -x cu -rdc=true -I"C:\ProgramData\NVIDIA Corporation\OptiX SDK 7.1.0\include" -I"D:\code\pbrt-v4\src" -I"D:\code\pbrt-v4\build_gpu" -I"D:\code\pbrt-v4\src\ext\openvdb\nanovdb" -I"D:\code\pbrt-v4\src\ext" -I"D:\code\pbrt-v4\src\ext\stb" -I"D:\code\pbrt-v4\src\ext\openexr\IlmBase\Imath" -I"D:\code\pbrt-v4\src\ext\openexr\IlmBase\Half" -I"D:\code\pbrt-v4\src\ext\openexr\IlmBase\Iex" -I"D:\code\pbrt-v4\src\ext\openexr\OpenEXR\IlmImf" -I"D:\code\pbrt-v4\build_gpu\src\ext\openexr\IlmBase\config" -I"D:\code\pbrt-v4\build_gpu\src\ext\openexr\OpenEXR\config" -I"D:\code\pbrt-v4\src\ext\zlib" -I"D:\code\pbrt-v4\build_gpu\src\ext\zlib" -I"D:\code\pbrt-v4\src\ext\filesystem" -I"D:\code\pbrt-v4\src\ext\ptex\src\ptex" -I"D:\code\pbrt-v4\src\ext\double-conversion" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\include" -G --keep-dir x64\Release -maxrregcount=0 --machine 64 --compile -cudart static -Xcudafe --diag_suppress=partial_override -Xcudafe --diag_suppress=virtual_function_decl_hidden -Xcudafe --diag_suppress=integer_sign_change -Xcudafe --diag_suppress=declared_but_not_referenced -Xcudafe --diag_suppress=implicit_return_from_non_void_function --expt-relaxed-constexpr --extended-lambda -Xnvlink -suppress-stack-size-warning --std=c++17 /wd4305 /wd4244 /wd4843 /wd4267 /wd4838 /wd26495 /wd26451 -Xcompiler="/EHsc -Ob2" -g -use_fast_math -D_WINDOWS -DNDEBUG -D_CRT_SECURE_NO_WARNINGS -DPBRT_IS_MSVC -DPBRT_BUILD_GPU_RENDERER -DNVTX -DPBRT_HAS_INTRIN_H -DPBRT_IS_WINDOWS -DNOMINMAX -D"PBRT_NOINLINE=__declspec(noinline)" -DPBRT_HAVE__ALIGNED_MALLOC -DPTEX_STATIC -D"CMAKE_INTDIR="Release"" -DWIN32 -D_WINDOWS -DNDEBUG -D_CRT_SECURE_NO_WARNINGS -DPBRT_IS_MSVC -DPBRT_BUILD_GPU_RENDERER -DNVTX -DPBRT_HAS_INTRIN_H 
-DPBRT_IS_WINDOWS -DNOMINMAX -D"PBRT_NOINLINE=__declspec(noinline)" -DPBRT_HAVE__ALIGNED_MALLOC -DPTEX_STATIC -D"CMAKE_INTDIR="Release"" -D_MBCS -Xcompiler "/EHsc /W3 /nologo /O2 /Fdpbrt_lib.dir\Release\pbrt_lib.pdb /FS /Zi /MD /GR" -o pbrt_lib.dir\Release\samples.obj "D:\code\pbrt-v4\src\pbrt\gpu\samples.cpp"” returned code 1。 pbrt_lib C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\MSBuild\Microsoft\VC\v160\BuildCustomizations\CUDA 11.0.targets 772

from pbrt-v4.

wuyakuma avatar wuyakuma commented on September 17, 2024

nvcc fatal : A single input file is required for a non-link phase when an outputfile is specified

Commenting out these lines (79-85) in CMakeLists.txt fixes this error for me:

list (APPEND PBRT_CXX_FLAGS /wd4305) # double constant assigned to float
list (APPEND PBRT_CXX_FLAGS /wd4244) # int -> float conversion
list (APPEND PBRT_CXX_FLAGS /wd4843) # double -> float conversion
list (APPEND PBRT_CXX_FLAGS /wd4267) # size_t -> int conversion
list (APPEND PBRT_CXX_FLAGS /wd4838) # double -> int conversion
list (APPEND PBRT_CXX_FLAGS /wd26495) # uninitialized member variable
list (APPEND PBRT_CXX_FLAGS /wd26451) # arithmetic on 4-byte value, then cast to 8-byte

from pbrt-v4.

mmp avatar mmp commented on September 17, 2024

I am admittedly not very good with Windows and (embarrassingly) don't have a Windows system with a GPU at hand at the moment. I filed the very first issue, #1, about this for that reason. If we can together figure out how to get the Windows+GPU build working, that'd be fantastic.

Note that if you don't define PBRT_OPTIX7_PATH, then you don't get GPU support. And what fun is that? :-)

I expect that the sys/syscall.h thing can be fixed by putting those #includes inside #ifndef PBRT_IS_WINDOWS checks. I can preemptively do that tomorrow (but can't confirm the fix.)

That makes sense that commenting out those lines helps, @wuyakuma. I can definitely see that the CUDA compiler isn't going to think those make sense. I can also try to fix that on my side, to pass those to MSVC but not to NVCC. With that fix, does it build and run on the GPU for you?

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

Removing the warning suppressions got me farther, but I'm still seeing some hard errors in the GPU code. I'll do another build tomorrow and update this thread.

from pbrt-v4.

jiangwei007 avatar jiangwei007 commented on September 17, 2024

When compiling the GPU version of pbrt on Windows with Visual Studio 2019, I get this error:
error #349: no operator "=" matches these operands
operand types are: pbrt::RGB = COLORREF

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

Here's what I'm seeing:
First error:

C:/cygwin64/home/goodin/pbrt-v4/src\pbrt/util/spectrum.cpp(255): error #42: operand types are incompatible ("pbrt::RGB" and "COLORREF")

Here's the code:

RGBSpectrum::RGBSpectrum(const RGBColorSpace &cs, const RGB &rgb)
    : rgb(rgb), illuminant(&cs.illuminant) {
    Float m = std::max({rgb.r, rgb.g, rgb.b});
    scale = 2 * m;
    rsp = cs.ToRGBCoeffs(scale ? rgb / scale : RGB(0, 0, 0));  // <---
}

This fixes the compile but I'm not sure it is correct:
RGBSpectrum::RGBSpectrum(const RGBColorSpace &cs, const RGB &rgb)
    : rgb(rgb), illuminant(&cs.illuminant) {
    Float m = std::max({rgb.r, rgb.g, rgb.b});
    // RMG
    RGB black(0, 0, 0);
    // RMG
    scale = 2 * m;
    // RMG
    rsp = cs.ToRGBCoeffs(scale ? rgb / scale : black);
    // RMG rsp = cs.ToRGBCoeffs(scale ? rgb / scale : RGB(0, 0, 0));
}

Second Error:
C:\cygwin64\home\goodin\pbrt-v4\src\pbrt/util/pstd.h(86): error : incomplete type is not allowed
detected during instantiation of class "pstd::array<T, N> [with T=int, N=0]"
C:/cygwin64/home/goodin/pbrt-v4/src/pbrt/bxdfs.cpp(764): here

C:\cygwin64\home\goodin\pbrt-v4\src\pbrt/util/pstd.h(86): error : incomplete type is not allowed
detected during instantiation of class "pstd::array<T, N> [with T=const float *, N=0]"
C:/cygwin64/home/goodin/pbrt-v4/src/pbrt/bxdfs.cpp(764): here

Here's the code, not sure what it doesn't like:

bxdfs.cpp:
/* Construct NDF interpolant data structure */
brdf->ndf = Warp2D0(alloc, (float *)ndf.data.get(), ndf.shape[1], ndf.shape[0], {},
                    {}, false, false);

pstd.h:
template <typename T, int N>
class array {
  public:
    using value_type = T;
    using iterator = value_type *;
    using const_iterator = const value_type *;
    using size_t = std::size_t;

    array() = default;
    PBRT_CPU_GPU
    array(std::initializer_list<T> v) {
        size_t i = 0;
        for (const T &val : v)
            values[i++] = val;
    }

    PBRT_CPU_GPU
    void fill(const T &v) {
        for (int i = 0; i < N; ++i)
            values[i] = v;
    }

    PBRT_CPU_GPU
    bool operator==(const array<T, N> &a) const {
        for (int i = 0; i < N; ++i)
            if (values[i] != a.values[i])
                return false;
        return true;
    }
    PBRT_CPU_GPU
    bool operator!=(const array<T, N> &a) const { return !(*this == a); }

    PBRT_CPU_GPU
    iterator begin() { return values; }
    PBRT_CPU_GPU
    iterator end() { return values + N; }
    PBRT_CPU_GPU
    const_iterator begin() const { return values; }
    PBRT_CPU_GPU
    const_iterator end() const { return values + N; }

    PBRT_CPU_GPU
    size_t size() const { return N; }

    PBRT_CPU_GPU
    T &operator[](size_t i) { return values[i]; }
    PBRT_CPU_GPU
    const T &operator[](size_t i) const { return values[i]; }

    PBRT_CPU_GPU
    T *data() { return values; }
    PBRT_CPU_GPU
    const T *data() const { return values; }

  private:
    T values[N] = {};
};

Third error:
C:/cygwin64/home/goodin/pbrt-v4/src/pbrt/gpu/pathintegrator.cpp(276): error : identifier "SYS_gettid" is undefined

C:/cygwin64/home/goodin/pbrt-v4/src/pbrt/gpu/pathintegrator.cpp(276): error : identifier "syscall" is undefined

The code:
#ifdef NVTX
nvtxNameOsThread(syscall(SYS_gettid), "DISPLAY_SERVER_COPY_THREAD");
#endif
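
The `syscall(SYS_gettid)` call above is Linux-specific, which is why both identifiers are undefined under MSVC. One possible way to keep the NVTX naming call portable is to hide the thread-id lookup behind a small helper. This is only a sketch: `CurrentOsThreadId` and `CheckThreadIdHelper` are hypothetical names that do not exist in pbrt, and the Windows branch assumes the Win32 `GetCurrentThreadId` function is an acceptable substitute.

```cpp
#include <cassert>
#include <cstdint>

#if defined(_WIN32)
#include <windows.h>      // GetCurrentThreadId
#elif defined(__linux__)
#include <sys/syscall.h>  // SYS_gettid
#include <unistd.h>       // syscall
#endif

// Hypothetical helper (not part of pbrt): returns an OS-level id for the
// calling thread, or 0 on platforms this sketch does not handle.
inline std::uint32_t CurrentOsThreadId() {
#if defined(_WIN32)
    return static_cast<std::uint32_t>(GetCurrentThreadId());
#elif defined(__linux__)
    return static_cast<std::uint32_t>(syscall(SYS_gettid));
#else
    return 0;
#endif
}

// Minimal usage check: on Windows and Linux the id should be non-zero.
inline void CheckThreadIdHelper() {
#if defined(_WIN32) || defined(__linux__)
    assert(CurrentOsThreadId() != 0);
#endif
}
```

With a helper like this, the call site could read `nvtxNameOsThread(CurrentOsThreadId(), "DISPLAY_SERVER_COPY_THREAD");` on both platforms.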

Fourth error:
C:/cygwin64/home/goodin/pbrt-v4/src/pbrt/gpu/film.cpp(22): error : calling a host function("isnan ") from a device function(" const") is not allowed

C:/cygwin64/home/goodin/pbrt-v4/src/pbrt/gpu/film.cpp(22): error : identifier "isnan " is undefined in device code

The code:
// Compute final weighted radiance value
SampledSpectrum Lw = SampledSpectrum(pixelSampleState.L[pixelIndex]) *
                     pixelSampleState.cameraRayWeight[pixelIndex];
CHECK(!std::isnan(Lw[0]));
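
The error suggests that, with MSVC's headers, `std::isnan` resolves to a host-only function when nvcc compiles device code. A common workaround is a small wrapper that picks the NaN test per compilation pass. The sketch below is an illustration under that assumption: `DEMO_CPU_GPU`, `IsNaNPortable` and `CheckIsNaNPortable` are hypothetical names, not pbrt's.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <type_traits>

// Hypothetical macro for this sketch; pbrt has its own PBRT_CPU_GPU.
#if defined(__CUDACC__)
#define DEMO_CPU_GPU __host__ __device__
#else
#define DEMO_CPU_GPU
#endif

// Hypothetical helper: selects a NaN test that is callable in the current
// compilation pass. __CUDA_ARCH__ is only defined while nvcc compiles
// device code, so the host pass still uses the standard library.
template <typename T>
DEMO_CPU_GPU inline bool IsNaNPortable(T v) {
    static_assert(std::is_floating_point<T>::value, "floating-point types only");
#if defined(__CUDA_ARCH__)
    return isnan(v);       // CUDA's device-callable overload
#else
    return std::isnan(v);  // host standard library
#endif
}

// Minimal usage check on the host.
inline void CheckIsNaNPortable() {
    assert(IsNaNPortable(std::numeric_limits<float>::quiet_NaN()));
    assert(!IsNaNPortable(1.0f));
}
```

The check in film.cpp would then become `CHECK(!IsNaNPortable(Lw[0]));`, assuming `Lw[0]` is a floating-point value.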

from pbrt-v4.

wuyakuma avatar wuyakuma commented on September 17, 2024

@mmp
Here is what I did:

After commenting out those warnings, I got

nvcc fatal: Cannot find compiler 'cl.exe' in PATH

so I changed line 142 in CMakeLists.txt to

if (MSVC)
  execute_process (COMMAND nvcc -lcuda ${CMAKE_SOURCE_DIR}/cmake/checkcuda.cu -ccbin ${CMAKE_CXX_COMPILER} -o ${OUTPUTFILE})
else  ()
  execute_process (COMMAND nvcc -lcuda ${CMAKE_SOURCE_DIR}/cmake/checkcuda.cu -o ${OUTPUTFILE})
endif ()

BUILD_SHARED_LIBS also needs to be set to ON:

set (BUILD_SHARED_LIBS ON)

Otherwise the IlmImf project is built as a static library, which causes link errors.

After this, add
#include <algorithm>
to launch.cpp to fix

D:/RayTracing/pbrt/pbrt-v4/src/pbrt/gpu/launch.cpp(42): error : namespace "std" has no member "min"
D:/RayTracing/pbrt/pbrt-v4/src/pbrt/gpu/launch.cpp(43): error : namespace "std" has no member "max"

and

D:\RayTracing\pbrt\pbrt-v4\src\pbrt/util/pstd.h(86): error : incomplete type is not allowed detected during instantiation of class "pstd::array<T, N> [with T=int, N=0]"
D:/RayTracing/pbrt/pbrt-v4/src/pbrt/bxdfs.cpp(765): here

I believe this is because MSVC doesn't support zero-length arrays, which clang accepts:
using Warp2D0 = PiecewiseLinear2D<0>;
After changing <0> to <1> it compiles, but I think a template specialization might be better.
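
The template-specialization idea could look roughly like the following. This is a reduced sketch, not pbrt's code: `Array` is a hypothetical stand-in for `pstd::array` with only a few members, and the `PBRT_CPU_GPU` annotations are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Hypothetical stand-in for pstd::array, reduced to a few members.
template <typename T, int N>
class Array {
  public:
    Array() = default;
    // As in the original, the caller must not pass more than N values.
    Array(std::initializer_list<T> v) {
        std::size_t i = 0;
        for (const T &val : v)
            values[i++] = val;
    }
    T *begin() { return values; }
    T *end() { return values + N; }
    std::size_t size() const { return N; }
    T &operator[](std::size_t i) { return values[i]; }

  private:
    T values[N] = {};
};

// Partial specialization for N == 0: there is no storage member at all,
// so the compiler never sees the non-standard declaration T values[0].
template <typename T>
class Array<T, 0> {
  public:
    Array() = default;
    Array(std::initializer_list<T>) {}
    T *begin() { return nullptr; }
    T *end() { return nullptr; }
    std::size_t size() const { return 0; }
};
```

Compared with changing `<0>` to `<1>`, this keeps the zero-dimensional case free of dummy storage, at the cost of duplicating the member list in the specialization.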

Also, add some #ifndef PBRT_IS_WINDOWS checks like you said.

As for the RGB error, just put

#pragma push_macro("RGB")
#undef RGB

at the top of spectrum.cpp, and
#pragma pop_macro("RGB")
at the end of spectrum.cpp.
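
For context, the COLORREF in the error message comes from the function-like `RGB` macro in the Windows headers, which rewrites `RGB(0, 0, 0)` before the compiler ever sees the `pbrt::RGB` constructor. The sketch below reproduces the collision in isolation. The macro is a simplified stand-in written for this example (the real one is in wingdi.h), and the `demo` namespace and `Black` function are hypothetical.

```cpp
#include <cassert>

// Simplified stand-in for the RGB macro from the Windows headers (an
// assumption for this sketch; the real one yields a COLORREF).
#define RGB(r, g, b) \
    ((unsigned long)(((r) & 0xff) | (((g) & 0xff) << 8) | (((b) & 0xff) << 16)))

#pragma push_macro("RGB")  // remember the macro definition
#undef RGB                 // hide it while our own RGB type is in use

namespace demo {
struct RGB {
    RGB(float r, float g, float b) : r(r), g(g), b(b) {}
    float r, g, b;
};

// With the macro hidden, RGB(0, 0, 0) is a constructor call again.
inline RGB Black() { return RGB(0, 0, 0); }
}  // namespace demo

#pragma pop_macro("RGB")  // restore the macro for any code that follows
```

This also explains why the earlier workaround with a named variable compiles: in `RGB black(0, 0, 0);` the identifier `RGB` is not directly followed by `(`, so the function-like macro is not expanded.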

After all this, it compiles, but with a lot of link errors:

2>libpbrt.lib(pbrt.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_39_tmpxft_000069f8_00000000_8_pbrt_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(log.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_38_tmpxft_00005f60_00000000_8_log_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(stats.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_40_tmpxft_00005e04_00000000_8_stats_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(pstd.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_39_tmpxft_00007d00_00000000_8_pstd_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(vecmath.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_42_tmpxft_000092c0_00000000_8_vecmath_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(sampling.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_43_tmpxft_000092b8_00000000_8_sampling_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(spectrum.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_43_tmpxft_00009b10_00000000_8_spectrum_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(transform.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_44_tmpxft_00009724_00000000_8_transform_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(scattering.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_45_tmpxft_00009314_00000000_8_scattering_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(bxdfs.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_40_tmpxft_000062b0_00000000_8_bxdfs_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(primes.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_41_tmpxft_00002fdc_00000000_8_primes_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(options.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_42_tmpxft_00006e18_00000000_8_options_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(filters.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_42_tmpxft_00007f3c_00000000_8_filters_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(color.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_40_tmpxft_000091e4_00000000_8_color_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(lights.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_41_tmpxft_00002914_00000000_8_lights_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(colorspace.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_45_tmpxft_0000940c_00000000_8_colorspace_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(math.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_39_tmpxft_00009314_00000000_8_math_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(mesh.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_39_tmpxft_00004c98_00000000_8_mesh_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(shapes.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_41_tmpxft_000092dc_00000000_8_shapes_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(lightsamplers.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_48_tmpxft_000093cc_00000000_8_lightsamplers_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(error.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_40_tmpxft_0000381c_00000000_8_error_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(samplers.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_43_tmpxft_00007f3c_00000000_8_samplers_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(sobolmatrices.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_48_tmpxft_00007754_00000000_8_sobolmatrices_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(bluenoise.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_44_tmpxft_00007288_00000000_8_bluenoise_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(pmj02tables.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_46_tmpxft_00009330_00000000_8_pmj02tables_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(film.cpp.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_39_tmpxft_00007dc0_00000000_8_film_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(cameras.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_42_tmpxft_00008f04_00000000_8_cameras_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(init.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_39_tmpxft_00009624_00000000_8_init_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(check.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_40_tmpxft_00006b30_00000000_8_check_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(interaction.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_46_tmpxft_000066c8_00000000_8_interaction_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(rng.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_38_tmpxft_00008f98_00000000_8_rng_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(lowdiscrepancy.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_49_tmpxft_000060a0_00000000_8_lowdiscrepancy_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(noise.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_40_tmpxft_00007f70_00000000_8_noise_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(textures.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_43_tmpxft_00008f04_00000000_8_textures_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(materials.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_44_tmpxft_00008ba8_00000000_8_materials_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)
2>libpbrt.lib(bssrdf.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_41_tmpxft_0000277c_00000000_8_bssrdf_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)

I am not familiar with CUDA, so I'm not sure what to do next.

The attached file is a patch, in case anyone needs it:
pbrt-v4_MSVC_compile_fix.diff.txt

from pbrt-v4.

jiangwei007 avatar jiangwei007 commented on September 17, 2024

@mmp
When linking the pbrt_exe project, there are some link errors:
libpbrt.lib(pbrt.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_39_tmpxft_000069f8_00000000_8_pbrt_cpp1_ii_27c0afcc referenced in function "void __cdecl __sti____cudaRegisterAll(void)" (?__sti____cudaRegisterAll@@YAXXZ)

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

I've just compiled with your latest commit from scratch. The path where the zlib static library is found appears to be off. The library is built in build\src\ext\zlib\Debug\zlibstatic.lib. It is referenced as:

------ Build started: Project: wtest, Configuration: Debug x64 ------
LINK : fatal error LNK1104: cannot open file '......\zlib\Debug\zlibstatic_d.lib'
LINK : fatal error LNK1104: cannot open file '......\zlib\Debug\zlibstatic_d.lib'

I don't know enough about where the project is building to figure out the difference.

Here's another error (I'm also getting a lot of the "host function" warnings):

C:\cygwin64\home\goodin\pbrt-v4\src\pbrt/util/sampling.h(732): warning : calling a host function from a host device function is not allowed

C:\cygwin64\home\goodin\pbrt-v4\src\pbrt/util/pstd.h(86): error : incomplete type is not allowed

I'm not seeing any other explicit errors. I'm getting 13 failures total. Most seem to be related to not finding zlib.

from pbrt-v4.

mmp avatar mmp commented on September 17, 2024

(Status: I think that ToT includes all of the fixes that @wuyakuma listed, except for the shared libraries one.)

From a little googling, it looks like switching to shared libs is the cause of those missing __cudaRegisterLinkedBinary* symbols. In general, I've found it's best to avoid shared libs for pbrt, just because it's one more thing that can confuse students who are trying to use the system. In this case, when BUILD_SHARED_LIBS is set to ON, it also looks like the OpenEXR DLLs are going into a different directory than pbrt.exe, which makes it even more confusing. (So, my preference would be to figure out a non-shared lib approach, if that is possible.)

zlib and OpenEXR on windows have been a long-standing headache with the pbrt build. OpenEXR requires a zlib install, but we try to build it for the user if it isn't installed, just to make pbrt self-contained. An alternative could be to just require that people install zlib themselves; I'm wondering if that would take care of that zlibstatic issue you're seeing @richardmgoodin.

I'll take a look at those host/device function call warnings. There are a handful of them still on Linux, but they're all innocuous there.

Is there any more context on that "incomplete type" error?

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

I'm good with the non shared library approach. I can't believe that anyone running pbrt would have problems with a larger executable. I'm also OK with installing zlib separately. I already have to do installs of CUDA and OptiX so one more wouldn't be an issue for the GPU version.

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

Unfortunately I just did a recursive pull and got a new non-building version of OpenEXR. This seems to be a consistent problem with their ToT in my very limited experience. That failure appears to be masking the others, as I'm not seeing any other errors. Is there any way in Git I can pull a specific version of a recursive library?

from pbrt-v4.

pierremoreau avatar pierremoreau commented on September 17, 2024

"An alternative could be to just require that people install zlib themselves"

A third option could be using vcpkg (or Conan): it works on Windows, Linux and macOS, integrates with CMake, and now has a manifest file (if you enable that feature on your computer, it will automatically download the listed dependencies before the rest of CMakeLists.txt runs). It does not have versioning support yet (that might come in a month or two), so you end up with the latest packaged version, which could be enough for PBRT.

"Is there any way in Git I can pull a specific version of a recursive library?"

If you cd src/ext/openexr, you can just run a git checkout to whatever you want.

from pbrt-v4.

mmp avatar mmp commented on September 17, 2024

@richardmgoodin ext/openexr should be at:

commit 5cfb5dab6dfada731586b0281bdb15ee75e26782 (HEAD -> zlibstatic-export-workaround, origin/zlibstatic-export-workaround)

(That was actually me forking ToT OpenEXR a week or two ago to fix a Windows build break related to zlib, so maybe your zlibstatic issues are due to having done a sync to a newer version?)

@pierremoreau ah, interesting, good pointer. I'll look at those more closely. On one hand, I'm hoping that we're almost there and that it's just another small fix or two and everything will work, so I'd rather not make big changes to how the ext/ stuff is handled unless necessary. On the other hand, that might be the best long-term option.

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

Yes, that's the version I have. So I have been assuming that it was changes to OpenEXR that were breaking the build. I have also backed up to OpenEXR 2.3.5, which gives the same errors.

That was a long and twisty divergence. I just deleted and pulled a new version. Here is what I'm seeing: it looks like I'm getting two blocks of errors, one from linking and one having to do with RGB. Here's the linking problem. It looks like the function call parameters are getting munged.

34>libpbrt_d.lib(pbrt.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_39_tmpxft_00000a40_00000000_8_pbrt_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
33> Creating library C:/cygwin64/home/goodin/pbrt-v4/build/Debug/pbrt_test.lib and object C:/cygwin64/home/goodin/pbrt-v4/build/Debug/pbrt_test.exp
34>libpbrt_d.lib(log.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_38_tmpxft_000034a4_00000000_8_log_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(stats.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_40_tmpxft_000031a0_00000000_8_stats_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(error.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_40_tmpxft_0000421c_00000000_8_error_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(pathintegrator.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_49_tmpxft_00004988_00000000_8_pathintegrator_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(init.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_39_tmpxft_00002354_00000000_8_init_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(pstd.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_39_tmpxft_000042c4_00000000_8_pstd_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(color.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_40_tmpxft_00004ba0_00000000_8_color_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(spectrum.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_43_tmpxft_00004b84_00000000_8_spectrum_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(mesh.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_39_tmpxft_00001720_00000000_8_mesh_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(shapes.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_41_tmpxft_00001af4_00000000_8_shapes_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(colorspace.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_45_tmpxft_00003e20_00000000_8_colorspace_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(options.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_42_tmpxft_00001188_00000000_8_options_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(filters.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_42_tmpxft_00004bf8_00000000_8_filters_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(film.cpp.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_39_tmpxft_00004764_00000000_8_film_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(cameras.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_42_tmpxft_00003978_00000000_8_cameras_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(lights.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_41_tmpxft_000049dc_00000000_8_lights_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(samplers.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_43_tmpxft_00002c54_00000000_8_samplers_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(textures.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_43_tmpxft_00001ac4_00000000_8_textures_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(check.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_40_tmpxft_00004e2c_00000000_8_check_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(vecmath.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_42_tmpxft_000012d4_00000000_8_vecmath_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(math.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_39_tmpxft_00004504_00000000_8_math_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(transform.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_44_tmpxft_000041f4_00000000_8_transform_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(materials.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_44_tmpxft_000025f8_00000000_8_materials_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(lightsamplers.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_48_tmpxft_000032ac_00000000_8_lightsamplers_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(launch.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_41_tmpxft_00003df4_00000000_8_launch_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(camera.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_41_tmpxft_00001dd4_00000000_8_camera_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(samples.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_42_tmpxft_00003014_00000000_8_samples_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(media.cpp.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_40_tmpxft_00003dec_00000000_8_media_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(subsurface.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_45_tmpxft_000030dc_00000000_8_subsurface_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(surfscatter.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_46_tmpxft_00002b98_00000000_8_surfscatter_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(film.cpp.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_39_tmpxft_0000319c_00000000_8_film_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(accel.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_40_tmpxft_000032ac_00000000_8_accel_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(sampling.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_43_tmpxft_00004760_00000000_8_sampling_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(interaction.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_46_tmpxft_00004710_00000000_8_interaction_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(primes.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_41_tmpxft_000000fc_00000000_8_primes_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(rng.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_38_tmpxft_00001a1c_00000000_8_rng_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(lowdiscrepancy.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_49_tmpxft_00004c3c_00000000_8_lowdiscrepancy_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(pmj02tables.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_46_tmpxft_00002e70_00000000_8_pmj02tables_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(noise.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_40_tmpxft_00002064_00000000_8_noise_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(scattering.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_45_tmpxft_00001fc0_00000000_8_scattering_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(bxdfs.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_40_tmpxft_00003424_00000000_8_bxdfs_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(sobolmatrices.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_48_tmpxft_00002dac_00000000_8_sobolmatrices_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(bluenoise.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_44_tmpxft_00001fec_00000000_8_bluenoise_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>libpbrt_d.lib(bssrdf.obj) : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_41_tmpxft_00001b10_00000000_8_bssrdf_cpp1_ii_27c0afcc referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
34>C:\cygwin64\home\goodin\pbrt-v4\build\Debug\pbrt.exe : fatal error LNK1120: 45 unresolved externals

And the following (looks like RGB again):

32>C:\cygwin64\home\goodin\pbrt-v4\src\pbrt\cmd\imgtool.cpp(1042,50): error C2440: 'initializing': cannot convert from 'initializer list' to 'std::vector<pbrt::RGB,std::allocator<pbrt::RGB>>'
32>C:\cygwin64\home\goodin\pbrt-v4\src\pbrt\cmd\imgtool.cpp(1171,1): message : No constructor could take the source type, or constructor overload resolution was ambiguous
32>C:\cygwin64\home\goodin\pbrt-v4\src\pbrt\cmd\imgtool.cpp(1217,51): error C2679: binary '=': no operator found which takes a right-hand operand of type 'COLORREF' (or there is no acceptable conversion)
32>C:\cygwin64\home\goodin\pbrt-v4\src\pbrt/util/color.h(147,1): message : could be 'pbrt::RGB &pbrt::RGB::operator =(pbrt::RGB &&)'
32>C:\cygwin64\home\goodin\pbrt-v4\src\pbrt/util/color.h(147,1): message : or 'pbrt::RGB &pbrt::RGB::operator =(const pbrt::RGB &)'
32>C:\cygwin64\home\goodin\pbrt-v4\src\pbrt\cmd\imgtool.cpp(1217,51): message : while trying to match the argument list '(pbrt::RGB, COLORREF)'
32>C:\cygwin64\home\goodin\pbrt-v4\src\pbrt\cmd\imgtool.cpp(1219,61): error C2679: binary '=': no operator found which takes a right-hand operand of type 'COLORREF' (or there is no acceptable conversion)
32>C:\cygwin64\home\goodin\pbrt-v4\src\pbrt/util/color.h(147,1): message : could be 'pbrt::RGB &pbrt::RGB::operator =(pbrt::RGB &&)'
32>C:\cygwin64\home\goodin\pbrt-v4\src\pbrt/util/color.h(147,1): message : or 'pbrt::RGB &pbrt::RGB::operator =(const pbrt::RGB &)'
32>C:\cygwin64\home\goodin\pbrt-v4\src\pbrt\cmd\imgtool.cpp(1219,61): message : while trying to match the argument list '(pbrt::RGB, COLORREF)'
32>C:\cygwin64\home\goodin\pbrt-v4\src\pbrt\cmd\imgtool.cpp(1222,27): warning C4244: 'initializing': conversion from 'pbrt::Float' to 'int', possible loss of data
32>C:\cygwin64\home\goodin\pbrt-v4\src\pbrt\cmd\imgtool.cpp(1223,73): warning C4267: 'argument': conversion from 'size_t' to 'const _Ty', possible loss of data

mmp commented on September 17, 2024

Just pushed a fix for the imgtool one--thanks. (I have no idea where RGB is getting defined and why it's only on the Windows+NVCC build...)

Are those __cudaRegisterLinkedBinary errors with a clean TOT, or have you set BUILD_SHARED_LIBS to ON, as per @wuyakuma's suggestion?

richardmgoodin commented on September 17, 2024

Those were with a clean ToT

mmp commented on September 17, 2024

Ok, could you try adding:

set_property(TARGET pbrt_lib PROPERTY CUDA_RESOLVE_DEVICE_SYMBOLS ON)

Down around line 661 of the top-level CMakeLists.txt and rebuilding?

(Via https://stackoverflow.com/a/51566919, which reports that this is a Windows-only cmake bug, and it sure looks like the same symptoms...)

richardmgoodin commented on September 17, 2024

I'm still seeing the missing zlib static library problem. This is on a clean tree. It also looks like you have a typo in your imgtool fix: line 58 says "indef". If we can get zlib to link, I think we are there. Do you want me to install zlib separately?

richardmgoodin commented on September 17, 2024

38>Generating Code...
38>LINK : fatal error LNK1104: cannot open file 'src\ext\zlib\Debug\zlibstatic_d.lib'

It looks like it is looking in the src tree instead of the build directory

richardmgoodin commented on September 17, 2024

Also when I look in the build tree I see zlibstatic.lib not zlibstatic_d.lib

mmp commented on September 17, 2024

If you wouldn't mind installing zlib, I'd be interested to hear if that makes a difference. (I'm not sure what's going on with those issues about looking for the wrong thing in the wrong place!)

richardmgoodin commented on September 17, 2024

I tried installing zlib, but apparently CMake didn't find it: the build still ran but didn't find zlib, and the log was identical to the previous run. I'm going to try downloading the source and building it that way. Where can I get the source? There appear to be multiple hits for zlib online.

richardmgoodin commented on September 17, 2024

I found it installed in c:\Program Files (x86)\Gnuwin32. I expect that part of the problem is that it's 32-bit. I'll see if I can figure out some way to build and install a 64-bit version.

richardmgoodin commented on September 17, 2024

I went into the build directory at build/src/ext/zlib/Debug and copied zlibstatic.lib to zlibstatic_d.lib. Then I rebuilt without cleaning. imgtool failed because it was looking for zlibstatic_d.pdb and I didn't copy that. Here's the result of trying to run pbrt:

goodin@DESKTOP-2L7L3L4 ~/pbrt-v4/build/Debug
$ ./pbrt.exe --gpu d:\pbrt\pbrt-v4-scenes-master\sanmiguel\sanmiguel-realistic-courtyard.pbrt
pbrt version 4 (built Aug 28 2020 at 17:51:00)
[ 19532.000 20200828.232625 C:\cygwin64\home\goodin\pbrt-v4\src\pbrt\cmd\pbrt.cpp:199 ] VERBOSE Running debug build
*** DEBUG BUILD ***
Copyright (c)1998-2020 Matt Pharr, Wenzel Jakob, and Greg Humphreys.
The source code to pbrt (but not the book contents) is covered by the BSD License.
See the file LICENSE.txt for the conditions of the license.
←[1m←[31mError←[0m: d:pbrtpbrt-v4-scenes-mastersanmiguelsanmiguel-realistic-courtyard.pbrt: The parameter is incorrect.

I can't find anywhere in the Cmake files where it is getting the "_d" suffix from. Any ideas?

pierremoreau commented on September 17, 2024

I can't find anywhere in the Cmake files where it is getting the "_d" suffix from. Any ideas?

Usually that would be from set (CMAKE_DEBUG_POSTFIX "_d") but that line is commented out here.

mmp commented on September 17, 2024

Getting code! (It seems like.)

That's a good idea--trying to make it work manually--hopefully then, the back-port of fixes to cmake isn't too bad..

Interestingly, the string "The parameter is incorrect" doesn't appear anywhere in the pbrt source code, so I don't know where that's coming from. However, I suspect it's an issue with path separator conventions: can you try changing into the scenes/sanmiguel directory and running pbrt on just sanmiguel-realistic-courtyard.pbrt from there?

richardmgoodin commented on September 17, 2024

It is running. I can see the curses data on the "Warning" message. I never get to "Rendering" with or without --gpu

$ ~/pbrt-v4/build/Debug/pbrt.exe --gpu sanmiguel-realistic-courtyard.pbrt
pbrt version 4 (built Aug 28 2020 at 17:51:00)
[ 15412.000 20200828.235719 C:\cygwin64\home\goodin\pbrt-v4\src\pbrt\cmd\pbrt.cpp:199 ] VERBOSE Running debug build
*** DEBUG BUILD ***
Copyright (c)1998-2020 Matt Pharr, Wenzel Jakob, and Greg Humphreys.
The source code to pbrt (but not the book contents) is covered by the BSD License.
See the file LICENSE.txt for the conditions of the license.
←[1m←[31mWarning←[0m: Specified aperture diameter 0.010000001 is greater than maximum possible 0.008756. Clamping it.

BTW, I get the same Warning message running v4 on my Mac.

richardmgoodin commented on September 17, 2024

Eventually:

[ 15412.000 20200829.000409 C:/cygwin64/home/goodin/pbrt-v4/src/pbrt/gpu/pathintegrator.cpp:541 ] FATAL CUDA error: invalid device ordinal
(C:\cygwin64\home\goodin\pbrt-v4\src\pbrt\util\check.cpp) 0x00007FF76D3AC050 - pbrt::PrintStackTrace + line 120
(C:\cygwin64\home\goodin\pbrt-v4\src\pbrt\util\check.cpp) 0x00007FF76D3AC410 - pbrt::CheckCallbackScope::Fail + line 148
(C:\cygwin64\home\goodin\pbrt-v4\src\pbrt\util\log.cpp) 0x00007FF76CEB2890 - pbrt::LogFatal + line 177
(C:\cygwin64\home\goodin\pbrt-v4\src\pbrt\util\log.h) 0x00007FF76CE8BF70 - pbrt::LogFatal<char const *> + line 112
(C:\cygwin64\home\goodin\pbrt-v4\src\pbrt\gpu\pathintegrator.cpp) 0x00007FF76CFABE60 - pbrt::GPURender + line 546
(C:\cygwin64\home\goodin\pbrt-v4\src\pbrt\cmd\pbrt.cpp) 0x00007FF76CE70A10 - main + line 238
(D:\agent_work\9\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl) 0x00007FF76D876A80 - invoke_main + line 79
(D:\agent_work\9\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl) 0x00007FF76D876830 - __scrt_common_main_seh + line 288
(D:\agent_work\9\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl) 0x00007FF76D876810 - __scrt_common_main + line 331
(D:\agent_work\9\s\src\vctools\crt\vcstartup\src\startup\exe_main.cpp) 0x00007FF76D876B40 - mainCRTStartup + line 17
(unknown ) 0x00007FFE53BF6FC0 - BaseThreadInitThunk
(unknown ) 0x00007FFE54A3CEA0 - RtlUserThreadStart

mmp commented on September 17, 2024

That's super exciting! (And "getting close" is what I meant to type earlier..) It's big progress to be linking (with some hacking) and at least getting started.

But that's an error I haven't seen before; it seems to be related to trying to talk to a non-existent GPU. Do you have multiple GPUs or anything like that that may not be well-handled (currently)?

richardmgoodin commented on September 17, 2024

I think I found the "_d" problem. Look at build/CMakeCache.txt:
build/CMakeCache.txt:CMAKE_DEBUG_POSTFIX:STRING=_d
In src/ext/zlib/CMakeLists.txt:
src/ext/zlib/CMakeLists.txt: #set(CMAKE_DEBUG_POSTFIX "d")

I'm not sure how the files relate but it sure seems suspicious. zlib is definitely building without the "_d"

richardmgoodin commented on September 17, 2024

This is without --gpu. I'll let it run and see if I get an image. It also seems to be running really slowly on Windows. I would expect my Windows machine to be roughly comparable to my Mac.

$ ~/pbrt-v4/build/Debug/pbrt.exe sanmiguel-realistic-courtyard.pbrt
pbrt version 4 (built Aug 28 2020 at 17:51:00)
[ 9392.000 20200829.000611 C:\cygwin64\home\goodin\pbrt-v4\src\pbrt\cmd\pbrt.cpp:199 ] VERBOSE Running debug build
*** DEBUG BUILD ***
Copyright (c)1998-2020 Matt Pharr, Wenzel Jakob, and Greg Humphreys.
The source code to pbrt (but not the book contents) is covered by the BSD License.
See the file LICENSE.txt for the conditions of the license.
←[1m←[31mWarning←[0m: Specified aperture diameter 0.010000001 is greater than maximum possible 0.008756. Clamping it.
Rendering: [ ] (93.6s|201103.0s)

richardmgoodin commented on September 17, 2024

pierremoreau commented on September 17, 2024

Maybe the Mac version is a release build, compared to the debug one you seem to be running on Windows?

Trying to see if I can also get it compiling on Windows, but keeping on getting errors that seem to evolve over time…

richardmgoodin commented on September 17, 2024

I do build all my CUDA code using Visual Studio 2019 but the code I have is much less complex than PBRT and doesn't use shared memory.

richardmgoodin commented on September 17, 2024

OK, I just checked I'm definitely not running a debug build on my Mac

richardmgoodin commented on September 17, 2024

I need to stop for this evening. This is great progress. Hopefully the non gpu version will produce a good image and then we can limit our investigation to the GPU code. I'll update the thread when the non gpu version finishes with the results.

pierremoreau commented on September 17, 2024

Which would make sense: PBRT will default to a Release build if you do not specify anything, whereas VS ignores CMAKE_BUILD_TYPE and defaults to Debug. (IIRC, the reason is that it is a multi-config generator, so it instead respects the configuration given when building from the command line, e.g. cmake --build $build_folder --config Release, or the one chosen inside the IDE via its dropdown.)

mmp commented on September 17, 2024

Sounds good--thanks for pushing this forward, @richardmgoodin, and let me know if I can help with anything, @pierremoreau...

(I really need to get a Windows system with a GPU set up, but I cannibalized my other one to switch to Linux...)

pierremoreau commented on September 17, 2024

Quick question @richardmgoodin, are you using CMake to generate a Visual Studio solution and open that in VS, or are you opening the folder directly in VS and using its built-in CMake support? I've been trying the latter as it usually worked well for me, but trying the former now and it seems to be progressing further so far.

richardmgoodin commented on September 17, 2024

pierremoreau commented on September 17, 2024

Okay, using the CMake generated solution file worked for me: I now reached the same point as you, having PBRT compiled and running but throwing that "FATAL CUDA error: invalid device ordinal" error when trying to use the GPU support.
I did not run into any issues regarding zlib though.

I'll open a separate issue to track whatever issues there might be when using the built-in CMake support, but I'll look into that CUDA error first.

richardmgoodin commented on September 17, 2024

mmp commented on September 17, 2024

Well that's good in a way that you're hitting the same error.

I have a theory now (and things may get messy.) pbrt makes a lot of use of unified CPU/GPU memory, and that code is all about getting things synchronized over to the GPU. Now, I know that unified memory is different in some ways on Windows (vs Linux), so I'm wondering if you guys are running into that.

It might be interesting to comment out those lines (from cudaGetDevice down through mr->PrefetchToGPU) and see what happens. If it happens to run, then that's a good sign. (Though the first wave of rays will be slow as everything gets paged in on demand.) It has gotten pretty far at this point, which suggests that unified memory has been generally working, so maybe it's just a problem with the prefetching stuff?

pierremoreau commented on September 17, 2024

@mmp The issue is that my GPU does not have the device attribute cudaDevAttrConcurrentManagedAccess set, which is required according to the CUDA documentation (emphasis mine):

cudaMemAdviseSetPreferredLocation: This advice sets the preferred location for the data to be the memory belonging to device. Passing in cudaCpuDeviceId for device sets the preferred location as host memory. If device is a GPU, then it must have a non-zero value for the device attribute cudaDevAttrConcurrentManagedAccess

I tried the following (in src/pbrt/gpu/pathintegrator.cpp)

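    // Query whether this device supports concurrent managed access; the
    // preferred-location advice below requires it.
    int hasConcurrentManagedAccess = 0;
    CUDA_CHECK(cudaDeviceGetAttribute(&hasConcurrentManagedAccess,
                                      cudaDevAttrConcurrentManagedAccess, deviceIndex));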
    if (hasConcurrentManagedAccess) {
        CUDA_CHECK(cudaMemAdvise(integrator, sizeof(*integrator),
                                 cudaMemAdviseSetReadMostly, 0));
        CUDA_CHECK(cudaMemAdvise(integrator, sizeof(*integrator),
                                 cudaMemAdviseSetPreferredLocation, deviceIndex));
    }

but then I instead run into

[ 3984.000 20200829.020348 N:\Users\Liara\source\code\pbrt-v4\src\pbrt\util\memory.cpp:106 ] FATAL CUDA error: invalid device ordinal
(N:\Users\Liara\source\code\pbrt-v4\src\pbrt\util\check.cpp)    0x00007FF71135C6B0 - pbrt::PrintStackTrace + line 121
(N:\Users\Liara\source\code\pbrt-v4\src\pbrt\util\check.cpp)    0x00007FF71135C580 - pbrt::CheckCallbackScope::Fail + line 148
(N:\Users\Liara\source\code\pbrt-v4\src\pbrt\util\log.cpp)      0x00007FF711015060 - pbrt::LogFatal + line 177
(N:\Users\Liara\source\code\pbrt-v4\src\pbrt\util\log.h)        0x00007FF711005FB0 - pbrt::LogFatal<char const *> + line 112
(N:\Users\Liara\source\code\pbrt-v4\src\pbrt\util\memory.cpp)   0x00007FF7110440E0 - pbrt::CUDATrackedMemoryResource::PrefetchToGPU + line 105
(N:\Users\Liara\source\code\pbrt-v4\src\pbrt\gpu\pathintegrator.cpp)    0x00007FF7110D0090 - pbrt::GPURender + line 558
(N:\Users\Liara\source\code\pbrt-v4\src\pbrt\cmd\pbrt.cpp)      0x00007FF711002B70 - main + line 238
(D:\agent\_work\9\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl)      0x00007FF7115A891C - __scrt_common_main_seh + line 288
(unknown                                 )      0x00007FF9532F7BC0 - BaseThreadInitThunk
(unknown                                 )      0x00007FF95460CE30 - RtlUserThreadStart

Did you see what I posted to the thread about the “_d” in the two CMakefiles

I saw that and it is a bit weird cause the commented out line says "d" and not "_d", and I did not run into that issue. I wonder if you somehow got a stale CMakeCache.txt file but no idea where that "_d" originally came from.

pierremoreau commented on September 17, 2024

Ah, just saw your comment regarding mr, so now with the following

    GPUPathIntegrator *integrator =
        gpuMemoryAllocator.new_object<GPUPathIntegrator>(gpuMemoryAllocator, scene);

    int deviceIndex;
    CUDA_CHECK(cudaGetDevice(&deviceIndex));
    int hasConcurrentManagedAccess = 0;
    CUDA_CHECK(cudaDeviceGetAttribute(&hasConcurrentManagedAccess,
                                      cudaDevAttrConcurrentManagedAccess, deviceIndex));

    if (hasConcurrentManagedAccess) {
        // Set things up so that we can still read from the
        // GPUPathIntegrator struct on the CPU without hurting
        // performance. (This makes it possible to use the values of things
        // like GPUPathIntegrator::haveSubsurface to conditionally launch
        // kernels according to what's in the scene...)

        CUDA_CHECK(cudaMemAdvise(integrator, sizeof(*integrator),
                                 cudaMemAdviseSetReadMostly, 0));
        CUDA_CHECK(cudaMemAdvise(integrator, sizeof(*integrator),
                                 cudaMemAdviseSetPreferredLocation, deviceIndex));

        // Copy all of the scene data structures over to GPU memory.  This
        // ensures that there isn't a big performance hitch for the first batch
        // of rays as that stuff is copied over on demand.
        CUDATrackedMemoryResource *mr =
            dynamic_cast<CUDATrackedMemoryResource *>(gpuMemoryAllocator.resource());
        CHECK(mr != nullptr);
        mr->PrefetchToGPU();
    }

I get

Exception thrown at 0x00007FF720C60ED4 in pbrt.exe: 0xC0000006: In page error reading location 0x0000001600052916 (status code 0xC0000022).

from GPUPathIntegrator::GenerateCameraRays(int y0, int sampleIndex).

pierremoreau commented on September 17, 2024

Sooo

src\ext\openexr\IlmBase\config\IlmBaseSetup.cmake:53:set(CMAKE_DEBUG_POSTFIX "_d" CACHE STRING "Suffix for debug builds")
src\ext\openexr\OpenEXR\config\OpenEXRSetup.cmake:47:set(CMAKE_DEBUG_POSTFIX "_d" CACHE STRING "Suffix for debug builds")
src\ext\openexr\OpenEXR_Viewers\config\OpenEXRViewersSetup.cmake:27:set(CMAKE_DEBUG_POSTFIX "_d" CACHE STRING "Suffix for debug builds")

I wonder if those are leaking into the main cache for whatever reason…

pierremoreau commented on September 17, 2024

Confirmed: if I comment out both lines, CMAKE_DEBUG_POSTFIX does not end up being set in the cache.

src\ext\openexr\IlmBase\config\IlmBaseSetup.cmake:53:set(CMAKE_DEBUG_POSTFIX "_d" CACHE STRING "Suffix for debug builds")
src\ext\openexr\OpenEXR\config\OpenEXRSetup.cmake:47:set(CMAKE_DEBUG_POSTFIX "_d" CACHE STRING "Suffix for debug builds")

As far as I understand the documentation for set() in the context of setting a cache variable, there is no scope. It looks like this is a bug on OpenEXR side and they should not be setting an internal CMake variable into the cache as it will mess things up for everyone. Instead they should define their own cache variable (for example OPENEXR_DEBUG_POSTFIX), and use set (CMAKE_DEBUG_POSTFIX "$CACHE{OPENEXR_DEBUG_POSTFIX}") as that will only set it within the current scope and will not impact other libraries.

jiangwei007 commented on September 17, 2024

Compiling the GPU version of pbrt on Windows with a GTX 1070, I get an error: 14>H:\NVIDIA Corporation\GPU Computing Toolkit\CUDA\v11.0\include\cuda\std\detail/__atomic(10): fatal error C1189: #error: "CUDA atomics are only supported for sm_60 and up on *nix and sm_70 and up on Windows."
Adding set (ARCH "sm_70") at line 145 of CMakeLists.txt resolves the issue.

pierremoreau commented on September 17, 2024

Compiling the GPU version of pbrt on Windows with a GTX 1070, I get an error: 14>H:\NVIDIA Corporation\GPU Computing Toolkit\CUDA\v11.0\include\cuda\std\detail/__atomic(10): fatal error C1189: #error: "CUDA atomics are only supported for sm_60 and up on *nix and sm_70 and up on Windows."
Adding set (ARCH "sm_70") at line 145 of CMakeLists.txt resolves the issue.

I do not know how well that will work at runtime, considering a GTX 1070 is sm_61 which is less than the minimum version required on Windows.
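
For reference, the support rule quoted in that error message can be written as a small predicate. This is only a host-side sketch under stated assumptions: the name supportsCudaStdAtomics and the Platform enum are hypothetical (they are not part of pbrt or the CUDA toolkit), and the thresholds come solely from the error text above (sm_60 and up on *nix, sm_70 and up on Windows).

```cpp
// Hypothetical helper, not part of pbrt or the CUDA toolkit. It encodes the
// rule stated in the error message: "sm_60 and up on *nix and sm_70 and up
// on Windows".
enum class Platform { Windows, Unix };

constexpr bool supportsCudaStdAtomics(Platform platform, int smMajor, int smMinor) {
    // Compute capability 6.1 becomes 61, 7.5 becomes 75, and so on.
    return smMajor * 10 + smMinor >= (platform == Platform::Windows ? 70 : 60);
}

// A GTX 1070 is sm_61: below the Windows minimum, above the *nix minimum.
static_assert(!supportsCudaStdAtomics(Platform::Windows, 6, 1),
              "sm_61 should be rejected on Windows");
static_assert(supportsCudaStdAtomics(Platform::Unix, 6, 1),
              "sm_61 should be accepted on *nix");
// An sm_75 device clears the Windows minimum.
static_assert(supportsCudaStdAtomics(Platform::Windows, 7, 5),
              "sm_75 should be accepted on Windows");
```

Forcing ARCH to sm_70 only changes what the compiler targets; it does not change what the installed GPU can actually execute, which is the runtime concern raised here.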

mmp commented on September 17, 2024

I didn't realize that the new cuda::atomic stuff wasn't available with sm_60 GPUs on Windows! I've just pushed a fix that falls back to regular CUDA atomics in that case; hopefully that will fix that error for you, @jiangwei007.

jiangwei007 commented on September 17, 2024

OK! Thanks! @mmp @pierremoreau

pierremoreau commented on September 17, 2024

The Debug build seems to be running fine, which is simultaneously awesome and sad cause I was hoping to gain some insight as to why the RelWithDebInfo was throwing an exception when trying to dispatch the camera rays generation. It looked like either some memory corruption happened along the way, or the GPUPathIntegrator allocated in void GPURender(ParsedScene &scene) was completely bogus.
I will continue looking into this.

richardmgoodin commented on September 17, 2024

I just pulled ToT as of about 1:00 PM EDT. I'm building it in debug and will try to run with --gpu. I pulled a previous version at about 10:00 AM EDT and built it in release. It rendered San Miguel correctly in non-GPU mode, but with --gpu it exits silently after printing the "Rendering [" message. I'll see if debug gives me more useful information. Since things are building for both release and debug, do you want me to open a new issue?

Debug is running with the latest ToT. I'll rebuild with release and try again.

jiangwei007 commented on September 17, 2024

When I run pbrt with --gpu, GPU utilization looks very low, around 1%~2%.

richardmgoodin commented on September 17, 2024

jiangwei007 commented on September 17, 2024

@richardmgoodin I'm running the debug version; the release version does not work.

pierremoreau commented on September 17, 2024

Same as Richard, I was seeing my GPU (RTX 2080 Ti) peak at 60%+ in the debug build. Maybe try adding --log-level verbose to get more information, and check there that it is indeed using the GPU.

mmp commented on September 17, 2024

With the debug build, it does a cudaDeviceSynchronize() after each kernel launch, so I'm sure that's killing performance and making utilization that bad. (The idea was that it's easier to tell which kernel is crashing, when there's a crash...) You could comment that out at line 78 of gpu/launch.h, but presumably perf would still be pretty bad since nothing was optimized. That should help with utilization at least.
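
To illustrate why a per-launch synchronization makes a crash easier to attribute, here is a host-only sketch. It does not use CUDA and is not pbrt's code: FakeDeviceQueue, LaunchReport, and runFrame are hypothetical stand-ins, and a "kernel" is modelled as a callable that returns false when it faults.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Outcome of a simulated frame (hypothetical type for this sketch).
struct LaunchReport {
    bool failed = false;
    std::string culprit;  // stays empty when the failure cannot be attributed
};

// Host-only model of an asynchronous launch queue; no GPU is involved.
class FakeDeviceQueue {
  public:
    explicit FakeDeviceQueue(bool syncAfterEachLaunch) : syncEach(syncAfterEachLaunch) {}

    // Queue a kernel. In "debug" mode, wait for it right away so a failure
    // can be reported against this launch's description.
    void launch(const std::string &description, std::function<bool()> kernel) {
        assert(kernel && "kernel must be callable");
        pending.push_back(std::move(kernel));
        if (syncEach && !synchronize() && report.culprit.empty())
            report.culprit = description;
    }

    // Run everything queued so far. Like a device-wide sync, this only says
    // that something failed, not which queued kernel was responsible.
    bool synchronize() {
        bool ok = true;
        for (auto &kernel : pending)
            ok = kernel() && ok;
        pending.clear();
        if (!ok)
            report.failed = true;
        return ok;
    }

    const LaunchReport &result() const { return report; }

  private:
    bool syncEach;
    std::vector<std::function<bool()>> pending;
    LaunchReport report;
};

// Simulate one frame in which the second kernel faults.
LaunchReport runFrame(bool debugBuild) {
    FakeDeviceQueue queue(debugBuild);
    queue.launch("Generate camera rays", [] { return true; });
    queue.launch("Trace closest hit rays", [] { return false; });  // simulated fault
    queue.launch("Update film", [] { return true; });
    queue.synchronize();  // end-of-frame sync happens in both modes
    return queue.result();
}
```

In this model, runFrame(true) reports the failure against "Trace closest hit rays", while runFrame(false) only learns at the final sync that something failed. The cost is that the queue is drained after every launch, which is the serialization that hurts utilization in the debug build.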

mmp commented on September 17, 2024

(With release builds when everything's working, utilization is typically extremely good.)

richardmgoodin commented on September 17, 2024

The debug version produces a correct image with San Miguel. It looks like it is hanging on printing the Statistics.

←[1m←[31mWarning←[0m: Specified aperture diameter 0.010000001 is greater than maximum possible 0.008756. Clamping it.
Rendering: [+++++++++++++++++++++++++++++++++++++++++++] (4137.7s)
GPU Kernel Profile:
Generate Camera rays 2048 launches 620651.19 ms / 22.5% (avg 303.052, min 297.700, max 357.692)
Generate ray samples - HaltonSampler 12288 launches 89291.75 ms / 3.2% (avg 7.267, min 0.933, max 19.834)
Tracing closest hit rays 12288 launches 236322.84 ms / 8.6% (avg 19.232, min 6.079, max 43.161)
Handle escaped rays 12288 launches 16068.27 ms / 0.6% (avg 1.308, min 0.424, max 4.825)
Handle emitters hit by indirect rays 12288 launches 4733.81 ms / 0.2% (avg 0.385, min 0.370, max 1.151)
CoatedDiffuseMaterial + BxDF Eval (Basic tex) 10240 launches 675164.31 ms / 24.5% (avg 65.934, min 22.684, max 175.502)
DielectricMaterial + BxDF Eval (Basic tex) 10240 launches 26517.63 ms / 1.0% (avg 2.590, min 1.671, max 8.864)
DiffuseMaterial + BxDF Eval (Basic tex) 10240 launches 776855.13 ms / 28.1% (avg 75.865, min 4.810, max 178.483)
DiffuseTransmissionMaterial + BxDF Eval (Basic tex) 10240 launches 47183.78 ms / 1.7% (avg 4.608, min 3.924, max 8.072)
Tracing shadow rays 10240 launches 211155.38 ms / 7.6% (avg 20.621, min 7.315, max 41.847)
Incorporate shadow ray contribution 10240 launches 20649.59 ms / 0.7% (avg 2.017, min 0.731, max 6.732)
Handle medium transitions 10240 launches 6349.11 ms / 0.2% (avg 0.620, min 0.598, max 2.051)
Update Film 2048 launches 26193.49 ms / 0.9% (avg 12.790, min 12.527, max 16.808)
Other 36864 launches 3320.22 ms / 0.1% (avg 0.090)

Total GPU time: 2760456.00 ms

GPU Statistics:
Camera rays 1269124203
Indirect rays, depth 1 1229059028
Indirect rays, depth 2 1114725766
Indirect rays, depth 3 70637698
Indirect rays, depth 4 41398871
Indirect rays, depth 5 24221358
Shadow rays, depth 0 823733138
Shadow rays, depth 1 458363989
Shadow rays, depth 2 541905019
Shadow rays, depth 3 26554950
Shadow rays, depth 4 19051550

Statistics:

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

Here's a verbose log that seems to back up what you are seeing in terms of failing on the first pixel launch.
log.zip

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

Here's a debug build logged for contrast. I'm seeing a lot of:

[ 4288.000 20200829.200059 C:\cygwin64\home\goodin\pbrt-v4\src\pbrt\util\memory.cpp:93 ] VERBOSE Ignoring dealloc of 8 bytes

in the release log that's not in the debug log. Is this significant?

logdbg.zip

from pbrt-v4.

pierremoreau avatar pierremoreau commented on September 17, 2024

Looks like the allocation is fine, but the GPUDo("Reset ray queue", […]); in pathintegrator.cpp:GPUPathIntegrator::Render() is completely messing up the memory:

image
When looking at memory starting at the address stored in the this pointer, before the GPUDo()

image
The same memory region after the GPUDo().

If I have the following log LOG_VERBOSE("Sampler: [ TaggedPointer ptr: 0x%p tag: %d ]", sampler.ptr(), sampler.Tag()); right before the GPUDo(), I see

[ 11480.000 20200829.214924 N:/Users/Liara/source/code/pbrt-v4/src/pbrt/gpu/pathintegrator.cpp:352 ] VERBOSE Sampler: [ TaggedPointer ptr: 0x000000020E4A4E00 tag: 1 ]

and if I perform a second, identical log right after the GPUDo(), it triggers a read access violation.

from pbrt-v4.

mmp avatar mmp commented on September 17, 2024

@richardmgoodin those warnings are fine. (I should get rid of them--they're not really meaningful now.) Thanks for posting all of that information.

On my system here (linux, 2080 RTX GPU), in release mode, that scene renders in 62.5 seconds. I just kicked off a run with a debug build, and it's trending toward about 2800s, so that seems the same general range of what you saw with a debug build on windows.

22% of time generating camera rays is surprisingly high; it's 9.9% on a release build for me here. (We'll see about debug, when that finishes.) I wonder if that's a place that's getting caught up with both CPU and GPU accessing the same memory, which can hurt performance without that cudaMemAdvise hint that Pierre had to disable.

from pbrt-v4.

mmp avatar mmp commented on September 17, 2024

@pierremoreau Weird! I've never seen anything like that, so don't have any guesses off the top of my head. That is the very first kernel that the path tracer launches (other than the stuff OptiX launches to build the acceleration structures earlier), so I'm guessing it's a sign of a systemic issue rather than a problem with that kernel.

A random thought might be to add a printf("rayQueues[0] %p\n", rayQueues[0]) to that kernel and see if it's seeing the same value that the CPU does for rayQueues[0].

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

from pbrt-v4.

pierremoreau avatar pierremoreau commented on September 17, 2024

Something not waiting and getting to memory before it is ready?

Yeah… I was wondering why I was not seeing the result from the added printf(), and if I throw in a cudaDeviceSynchronize() right after the GPUDo(), I do see the printf() and now it no longer crashes on the following log. (And I can confirm that the value read by the GPU matches what I see from the CPU.)

from pbrt-v4.

pierremoreau avatar pierremoreau commented on September 17, 2024

Oh, and looking at the details of GPUParallelFor() (which is called by GPUDo()), guess what there is right after the launch, if NDEBUG is not defined? A cudaDeviceSynchronize(). Which explains why things work in Debug mode but not in Release.

from pbrt-v4.

pierremoreau avatar pierremoreau commented on September 17, 2024

If I uncomment the cudaDeviceSynchronize() in GPUParallelFor() as well as all the ones found in accel.cpp, PBRT seems to run fine. I will have a look at the CUDA Programming Guide, because it has been some time since I last did some CUDA and I never touched unified addressing.

I know that OpenCL has different levels of Shared Virtual Memory, some where the application needs to explicitly synchronise and some where it's not needed, and for (at least) the past two years there has been ongoing work in the Linux kernel to support the latter. I wonder if Windows is in a different situation and does not support the more advanced one.

from pbrt-v4.

pierremoreau avatar pierremoreau commented on September 17, 2024

@mmp It looks like I guessed right:

From the CUDA Programming Guide, Section L.1.1. System Requirements:

GPUs with SM architecture 6.x or higher (Pascal class or newer) provide additional Unified Memory features such as on-demand page migration and GPU memory oversubscription that are outlined throughout this document. Note that currently these features are only supported on Linux operating systems. Applications running on Windows (whether in TCC or WDDM mode) will use the basic Unified Memory model as on pre-6.x architectures even when they are running on hardware with compute capability 6.x or higher.

From the CUDA Programming Guide, Section L.2.2. Coherency and Concurrency:

Simultaneous access to managed memory on devices of compute capability lower than 6.x is not possible, because coherence could not be guaranteed if the CPU accessed a Unified Memory allocation while a GPU kernel was active. However, devices of compute capability 6.x on supporting operating systems allow the CPUs and GPUs to access Unified Memory allocations simultaneously via the new page faulting mechanism. A program can query whether a device supports concurrent access to managed memory by checking a new concurrentManagedAccess property.

That concurrentManagedAccess attribute is indeed not set on my GPU on Windows despite being of compute capability 7.5.

from pbrt-v4.

mmp avatar mmp commented on September 17, 2024

(@pierremoreau see #20 (comment) re those synchronizations; they're just there so that the debugging logs of "I launched this kernel" and "this kernel returned" are accurate; they should be independent of the unified memory synchronization.)

On the details of Windows unified memory, thanks for digging that up! I will read through all that and figure out what I need to do to make things work with the more limited version of it. I am curious why the debug build works, however.

One other open question: back to those cmake settings that openexr does but shouldn't be doing globally--do I need to change something in the openexr fork that pbrt-v4 uses to get the build working, or does it build out of the box now?

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

from pbrt-v4.

mmp avatar mmp commented on September 17, 2024

Ah, now I understand (what I think your point was, @pierremoreau), as I've read through some more of the unified memory details.

On Windows (where it's the basic model), it is expected to get a seg fault if the CPU accesses unified memory at the same time as a GPU kernel is running. Currently, the code does that in lots of places. However, because the debug build has those gratuitous synchronizations, it so happens that those accesses aren't happening at the same time there. Thus, the debug build works and release fails.

So a release build with those syncs after each launch would presumably be a lot faster (but would have poor utilization due to no pipelining).

Apologies if this is what you were getting at and I didn't get it at first.

After I finish reading the docs, I'll start digging into fixing the implementation to stop doing that...

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

from pbrt-v4.

mmp avatar mmp commented on September 17, 2024

Hopefully release mode will work with this change, which added explicit syncs after each kernel launch on Windows. I don't know what to expect as far as performance, but it should help a lot that the GPU kernels are optimized.

And I will think about how to rewrite things so that this isn't necessary, but it'll take a little thinking to figure out a plan.
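
For readers following along, the workaround described above can be sketched roughly as follows. This is only a minimal illustration, not the actual pbrt-v4 code: LaunchPolicy, NeedsSyncAfterLaunch, and LaunchKernel are hypothetical names, and the launch and sync callables stand in for a kernel launch and a call such as cudaDeviceSynchronize(), so that the sketch builds without the CUDA toolkit.

```cpp
#include <utility>

// Hypothetical policy flags. In a real build these would more likely be
// compile-time checks (NDEBUG, PBRT_IS_WINDOWS) than a runtime struct.
struct LaunchPolicy {
    bool debugBuild;          // NDEBUG is not defined
    bool basicUnifiedMemory;  // e.g. Windows: no concurrent CPU/GPU access
};

// True when the host should wait for the kernel to finish, either to keep
// per-kernel debug logs accurate or because the CPU must not touch managed
// memory while a kernel is still running (basic Unified Memory model).
inline bool NeedsSyncAfterLaunch(const LaunchPolicy &policy) {
    return policy.debugBuild || policy.basicUnifiedMemory;
}

// `launch` stands in for enqueuing a kernel and `sync` for something like
// cudaDeviceSynchronize(). Returns whether a synchronization was issued.
template <typename Launch, typename Sync>
bool LaunchKernel(const LaunchPolicy &policy, Launch &&launch, Sync &&sync) {
    std::forward<Launch>(launch)();
    if (!NeedsSyncAfterLaunch(policy))
        return false;
    std::forward<Sync>(sync)();
    return true;
}
```

With both flags false (a release build on a platform with concurrent managed access) the launch returns immediately and kernels can pipeline; with either flag set, every launch is followed by a blocking sync, which is what trades utilization for correctness here.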

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

from pbrt-v4.

mmp avatar mmp commented on September 17, 2024

Yep, they should be there in top-of-tree.

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

I pulled the tree about an hour ago. I'm still seeing a gpu failure at:

Rendering: [ ] (0.3s|?s)

and then it exits.

I enabled the CUDA_CHECK(cudaDeviceSynchronize()); lines in accel.cpp that were there in the debug version and it ran to completion. Here's some benchmarking:

16" MacBook Pro (8 core) 3264.6s
i7-7700 (4 core) 7468.0s
i7-7700 + GV100 Gpu 2073.4s

About 3.6x over the same CPU alone (7468.0s vs. 2073.4s) is not bad given the lack of pipelining.

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

Here's a verbose log of the failure. Looks like it is failing at the same place as before.
log.zip

from pbrt-v4.

mmp avatar mmp commented on September 17, 2024

Ah, good find on those missing synchronizations--sorry I missed those. Added now!

I was surprised that I only saw a ~7% slowdown when I added all those synchronizations to test their overhead on Linux. (So that San Miguel scene went from ~70s to render to ~75s. While that was on a 2080 GPU with RTX cores, most of the work isn't ray intersections, so I'd expect a GV100 to have really good performance--in the same ballpark.)

I suspect the issue gets back to lots of data being copied back and forth between CPU and GPU between kernel launches on Windows: CPU accesses something that lives on the GPU, it gets copied over, then a kernel is launched, and it has to be copied back (even though it wasn't modified.)

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

from pbrt-v4.

pierremoreau avatar pierremoreau commented on September 17, 2024

@mmp This is indeed what I was getting at, but I should have explained it better rather than just quote the programming guide. I will try the updated version later today, and then do a comparison with Linux as I have a dual boot with Arch.

You might want to add to the code something like the following in src/pbrt/gpu/init.cpp, GPUInit() after the device selection, to fail hard and early if we find an unsupported configuration:

    int hasUnifiedAddressing;
    CUDA_CHECK(
        cudaDeviceGetAttribute(&hasUnifiedAddressing, cudaDevAttrUnifiedAddressing, device));
    if (!hasUnifiedAddressing) {
        LOG_FATAL("The selected device (%d) does not support unified addressing.", device);
    }

    // On Windows we perform additional synchronisation to work around the lack of
    // concurrent managed access as this is a platform-wide issue and even occurs for
    // hardware that does support it.
#if !defined(PBRT_IS_WINDOWS)
    int hasConcurrentManagedAccess;
    CUDA_CHECK(cudaDeviceGetAttribute(&hasConcurrentManagedAccess,
                                      cudaDevAttrConcurrentManagedAccess,
                                      device));
    if (!hasConcurrentManagedAccess) {
        LOG_FATAL("The selected device (%d) does not support concurrent managed access.",
                  device);
    }
#endif

from pbrt-v4.

pierremoreau avatar pierremoreau commented on September 17, 2024

@mmp I knew I was forgetting to reply to one of your comments… Regarding OpenEXR, I will open an issue on their side and ask if they can change their behaviour, but I don’t know what could be done in PBRT. Maybe delete the cache variable after dealing with OpenEXR? But on the other hand I did not run into any issues regarding it and zlib.
@richardmgoodin When you did your clean builds, did you run into more issues regarding zlib?

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

I did run into the problem early on but haven't seen it lately with fresh builds either Release or Debug.

from pbrt-v4.

mmp avatar mmp commented on September 17, 2024

I believe I've figured out how to rewrite things without too much pain so that the CPU isn't accessing unified memory during rendering. I just pushed that in a branch, windows-gpu-rework, since it currently causes a 15-25% slowdown on Linux. However, if it works, performance on Windows should be much better. (I'll dig into that slowdown now to see what's going on.)

from pbrt-v4.

jiangwei007 avatar jiangwei007 commented on September 17, 2024

I'm running a Release build of the windows-gpu-rework branch on a GTX 1070, and GPU utilization always stays below 2%. Am I doing something wrong?

from pbrt-v4.

pierremoreau avatar pierremoreau commented on September 17, 2024

Utilisation rate is a lot lower for me today (though I did go through a complete re-install of Windows from scratch, so hard to say what is different); rendering killeroo-gold.pbrt with the default settings on the GPU (RTX 2080 Ti) is taking about 2h, and if Task Manager is to be trusted, GPU usage is at about 0.2% (though it is also saying it is currently using my other GPU). I need to compare against Linux later, and try the new branch.

from pbrt-v4.

mmp avatar mmp commented on September 17, 2024

Interesting... on Linux killeroo-gold renders for me in 13.8s on a RTX 2080. 13.8s / .002 = 6900s ~= 2 hours, so it looks like utilization is likely the entire problem here.

I don't have any good theories for what the cause might be though. Hmm.
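
The back-of-the-envelope estimate above can be written down as a tiny helper. This is just a sketch of the arithmetic, assuming the GPU does the same amount of work in both cases and that the reported utilization is accurate; EstimatedWallSeconds is a hypothetical name, not something from pbrt-v4.

```cpp
#include <cassert>

// Estimates wall-clock render time when the GPU is busy only a fraction of
// the time. busySeconds is the render time at (near) full utilization, and
// utilization is a fraction in (0, 1].
inline double EstimatedWallSeconds(double busySeconds, double utilization) {
    assert(utilization > 0.0 && utilization <= 1.0);
    return busySeconds / utilization;
}
```

Calling EstimatedWallSeconds(13.8, 0.002) gives roughly 6900 seconds, a little under two hours, which matches the render time reported on Windows.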

from pbrt-v4.

pierremoreau avatar pierremoreau commented on September 17, 2024

I'll fire up Nsight Systems to have a look there; maybe perf should be tracked in a separate issue though.

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

from pbrt-v4.

pierremoreau avatar pierremoreau commented on September 17, 2024

@richardmgoodin If you specify CUDACXX=path/to/nvcc as an environment variable before running CMake, does it find it? Otherwise try the solution from #23.

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

from pbrt-v4.

pierremoreau avatar pierremoreau commented on September 17, 2024

If you look at the output of nvcc --help for the nvcc you specified, does it include c++17 under the std option? Is it properly using a CUDA 11.0 install and not an earlier version?

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

from pbrt-v4.

olegded avatar olegded commented on September 17, 2024

@richardmgoodin You should update the PATH environment variable so nvcc v11 will be found before any other version. Then, in the same terminal run cmake (e.g. running cmake-gui in the same terminal should show you that all CUDA-related variables are automatically set to the right values at the first run)

export PATH=/usr/local/cuda/bin:$PATH #assuming /usr/local/cuda is where v11 is installed
which nvcc # should output  /usr/local/cuda/bin/nvcc
nvcc --version #should be v11

You could probably check the values of CUDA_TOOLKIT_INCLUDE (is /usr/local/cuda/include on my system) and CUDA_TOOLKIT_ROOT_DIR (is /usr/local/cuda) CMake variables.

Now, you should be able to compile it. At least, this is what I see on my end.

Oleg

from pbrt-v4.

richardmgoodin avatar richardmgoodin commented on September 17, 2024

from pbrt-v4.
