
Comments (7)

ashawkey avatar ashawkey commented on July 17, 2024

Hi, could you provide more details, e.g. a minimal reproducible code example?
From the screenshot, I can only guess that you seem to be trying to backward through composite_rays, which should only be used at inference?
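
(For reference, a minimal, hedged sketch of that distinction; the model and tensors below are stand-ins, not the torch-ngp API. The point is only that inference-time compositing should run under torch.no_grad(), so that backward() only ever sees the differentiable training path.)

import torch

model = torch.nn.Linear(3, 3)          # stand-in for the NeRF renderer
rays = torch.rand(4096, 3)             # stand-in for sampled ray inputs
gt = torch.rand(4096, 3)               # stand-in for ground-truth colours

# inference: no graph is built, so nothing tries to backward through
# inference-only ops such as composite_rays
with torch.no_grad():
    preds = model(rays)

# training: use the differentiable path and call backward on the loss
preds = model(rays)
loss = torch.nn.functional.mse_loss(preds, gt)
loss.backward()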

Dengzhi-USTC avatar Dengzhi-USTC commented on July 17, 2024

[screenshot: Selection_303]
I just ran the above script:
[screenshot: Selection_304]

ashawkey avatar ashawkey commented on July 17, 2024

@Dengzhi-USTC Are you using the latest version? I cannot reproduce the error...

$ python main_nerf.py data/fox/ --workspace trial_nerf --fp16 --ff --cuda_ray                                                                                       
Namespace(path='data/fox/', test=False, workspace='trial_nerf', seed=0, num_rays=4096, cuda_ray=True, num_steps=128, upsample_steps=128, max_ray_batch=4096, fp16=True, ff=True, tcnn=False, mode='colmap',
 preload=False, bound=2, scale=0.33, gui=False, W=800, H=800, radius=5, fovy=90, max_spp=64)                                                                                                               
NeRFNetwork(                                                                                                                                                                                               
  (encoder): HashEncoder: input_dim=3 num_levels=16 level_dim=2 base_resolution=16 per_level_scale=1.4472692374403782 params=(6328829, 2)                                                                  
  (sigma_net): FFMLP: input_dim=32 output_dim=16 hidden_dim=64 num_layers=2 activation=0                                                                                                                   
  (encoder_dir): SHEncoder: input_dim=3 degree=4                                                                                                                                                           
  (color_net): FFMLP: input_dim=32 output_dim=3 hidden_dim=64 num_layers=3 activation=0                                                                                                                    
)                                                                                                                                                                                                          
[INFO] Trainer: ngp | 2022-03-16_10-23-47 | cuda:0 | fp16 | trial_nerf                                                                                                                                     
[INFO] #parameters: 12676090                                                                                                                                                                               
[INFO] Loading latest checkpoint ...                                                                                                                                                                       
[WARN] No checkpoint found, model randomly initialized.                                                                                                                                                    
==> Start Training Epoch 1, lr=0.010000 ...                                                                                                                                                                
loss=0.0157 (0.0188): : 100% 49/49 [00:01<00:00, 40.71it/s]                                                                                                                                                
==> Finished Epoch 1.                                                                                                                                                                                      
==> Start Training Epoch 2, lr=0.010000 ...                                                                                                                                                                
loss=0.0207 (0.0228): : 100% 49/49 [00:01<00:00, 41.23it/s]                                                                                                                                                
==> Finished Epoch 2.                                                                                                                                                                                      
==> Start Training Epoch 3, lr=0.010000 ...                                                                                                                                                                
loss=0.0126 (0.0134): : 100% 49/49 [00:02<00:00, 21.19it/s]                                                                                                                                                
==> Finished Epoch 3.                                                                                                                                                                                      
==> Start Training Epoch 4, lr=0.010000 ...                                                                                                                                                                
loss=0.0071 (0.0076): : 100% 49/49 [00:02<00:00, 22.08it/s]                                                                                                                                                
==> Finished Epoch 4.                                                                                                                                                                                      
==> Start Training Epoch 5, lr=0.010000 ...                                                                                                                                                                
loss=0.0040 (0.0059): : 100% 49/49 [00:02<00:00, 21.86it/s]                                                                                                                                                
==> Finished Epoch 5.                                                                                                                                                                                      
==> Start Training Epoch 6, lr=0.010000 ...                                                                                                                                                                
loss=0.0045 (0.0046): : 100% 49/49 [00:02<00:00, 21.84it/s]                                                                                                                                                
==> Finished Epoch 6.                                                                                                                                                                                      
==> Start Training Epoch 7, lr=0.010000 ...                                                                                                                                                                
loss=0.0035 (0.0048): : 100% 49/49 [00:02<00:00, 21.85it/s]                                                                                                                                                
==> Finished Epoch 7.                                                                                                                                                                                      
==> Start Training Epoch 8, lr=0.010000 ...                                                                                                                                                                
loss=0.0034 (0.0042): : 100% 49/49 [00:02<00:00, 22.42it/s]                                                                                                                                                
==> Finished Epoch 8.                                                                                                                                                                                      
==> Start Training Epoch 9, lr=0.010000 ...                                                                                                                                                                
loss=0.0051 (0.0032): : 100% 49/49 [00:02<00:00, 21.91it/s]                                                                                                                                                
==> Finished Epoch 9.
==> Start Training Epoch 10, lr=0.010000 ...
loss=0.0017 (0.0029): : 100% 49/49 [00:02<00:00, 22.36it/s]
==> Finished Epoch 10.
++> Evaluate at epoch 10 ...
loss=0.0023 (0.0023): : 100% 1/1 [00:00<00:00,  1.18it/s]
PSNR = 22.404175
++> Evaluate epoch 10 Finished.
[INFO] New best result: None --> 0.0023169806227087975
==> Start Training Epoch 11, lr=0.010000 ...
loss=0.0022 (0.0027): : 100% 49/49 [00:02<00:00, 22.20it/s]

Dengzhi-USTC avatar Dengzhi-USTC commented on July 17, 2024

Yes, I downloaded the latest version.
I tried this on both a V100 and a 3090.
The inference part also produces the same problem.
[screenshot: Selection_313]
[screenshot: Selection_314]

Dengzhi-USTC avatar Dengzhi-USTC commented on July 17, 2024

OK, I solved the problem: the version of my torch was 1.8.

Karbo123 avatar Karbo123 commented on July 17, 2024

@ashawkey Hi, I've found a simple workaround for this strange phenomenon.
It seems to be caused by not explicitly returning a value in class _composite_rays(Function).
[screenshot]
I think it is better to manually add a return value (for example, return tuple()) to those functions that don't return anything, even if the return value is never used. The same change should be applied to class _compact_rays(Function). I suspect this is because an older PyTorch version has this bug, which is fixed in a later version.
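
A minimal sketch of the workaround with a toy Function (not the real _composite_rays signature): forward writes into its output buffers in place, so it has nothing meaningful to return, and falling off the end with an implicit None is what trips older PyTorch builds; the explicit empty return avoids that.

import torch
from torch.autograd import Function

class _composite_rays(Function):
    # toy stand-in: the real forward calls the compiled CUDA kernel in
    # torch-ngp's raymarching extension and writes into its image/depth
    # buffers in place, so there is no meaningful tensor to return
    @staticmethod
    def forward(ctx, sigmas, rgbs, image):
        image.add_(rgbs * sigmas)   # placeholder for the in-place kernel call
        return tuple()              # explicit return instead of implicit None

sigmas = torch.rand(8, 1)
rgbs = torch.rand(8, 3)
image = torch.zeros(8, 3)
_composite_rays.apply(sigmas, rgbs, image)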


By the way, another suggestion: it would be great if you could remove all the indexing='ij' arguments from the meshgrid calls so that the code is compatible with older PyTorch versions (note that 'ij' is already the default, so it can simply be removed).

ashawkey avatar ashawkey commented on July 17, 2024

@Karbo123 Thanks for the solution! I will add it soon. As for the reason for using indexing='ij': PyTorch warns that the default behaviour will be changed in later versions. I'll write a version condition to better support older versions.
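
(A sketch of one way such a condition could look; the helper name is illustrative. The indexing keyword only exists since PyTorch 1.10, and 'ij' matches the behaviour of older versions.)

import torch
from packaging import version

def custom_meshgrid(*tensors):
    # older PyTorch: no `indexing` kwarg, behaviour is already 'ij'
    if version.parse(torch.__version__) < version.parse('1.10'):
        return torch.meshgrid(*tensors)
    # newer PyTorch: pass indexing='ij' explicitly to silence the
    # future-behaviour warning and keep the old semantics
    return torch.meshgrid(*tensors, indexing='ij')

xs, ys = custom_meshgrid(torch.arange(4), torch.arange(3))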
