
MobileR2L's People

Contributors

alanspike, galalalala


MobileR2L's Issues

Ask some questions

Hi, thank you for your excellent work; I'm very interested in studying it. My own NeRF project uses torch_ngp-related tools (gridencoder, raymarching, etc.). If I want to apply your method to my project, which key modules do I need to replace?

About Distributed Training

Hello, first of all, thank you for open-sourcing such a great project. I would like to port it to the Jetson platform, but I have run into a problem with the distributed-training part of the code. Distributed backends typically include gloo and nccl, but the Jetson platform can only use a specific PyTorch build compiled by Nvidia, and unfortunately that build does not include nccl. As a result, when I train the teacher I get "module 'torch.distributed' has no attribute 'group'", and torch.distributed.is_available() returns False.

I would like to ask how to modify the code to disable the distributed parts and train on a single GPU. Even when I set num_gpus=1, it still reports an error. I am looking forward to your reply. Thank you very much.

Some of my device and conda environment info: Python 3.8, CUDA 11.4, PyTorch 1.13.0 (Nvidia-provided build), torchvision 0.14.1, torchaudio 0.13.1.

Some of the error messages are shown below:
File "train.py", line 427, in
trainer.fit(system, ckpt_path=hparams.ckpt_path)
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 603, in fit
call._call_and_handle_interrupt(
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 38, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 645, in _fit_impl
self._run(model, ckpt_path=self.ckpt_path)
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1098, in _run
results = self._run_stage()
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1177, in _run_stage
self._run_train()
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1200, in _run_train
self.fit_loop.run()
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
self.advance(*args, **kwargs)
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 267, in advance
self._outputs = self.epoch_loop.run(self._data_fetcher)
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.on_advance_end()
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 251, in on_advance_end
self._run_validation()
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 310, in _run_validation
self.val_loop.run()
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 206, in run
output = self.on_run_end()
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 180, in on_run_end
self._evaluation_epoch_end(self._outputs)
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 288, in _evaluation_epoch_end
self.trainer._call_lightning_module_hook(hook_name, output_or_outputs)
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1342, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "train.py", line 368, in validation_epoch_end
mean_psnr = all_gather_ddp_if_available(psnrs).mean()
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/utilities/distributed.py", line 161, in all_gather_ddp_if_available
return new_all_gather_ddp_if_available(*args, **kwargs)
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/lightning_lite/utilities/distributed.py", line 197, in _all_gather_ddp_if_available
group = group if group is not None else torch.distributed.group.WORLD
AttributeError: module 'torch.distributed' has no attribute 'group'
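One possible workaround (a minimal sketch, not the authors' code): guard the gather with a check for an initialized process group, so single-GPU runs on torch builds without distributed support fall back to a local mean. The `psnrs` tensor mirrors the call site in train.py's `validation_epoch_end`; the helper name is made up for illustration.

```python
import torch
import torch.distributed as dist


def gather_mean(psnrs: torch.Tensor) -> torch.Tensor:
    """Average PSNRs across ranks if DDP is initialized, else locally."""
    if dist.is_available() and dist.is_initialized():
        # All-gather one tensor per rank, then average the concatenation.
        gathered = [torch.zeros_like(psnrs) for _ in range(dist.get_world_size())]
        dist.all_gather(gathered, psnrs)
        return torch.cat(gathered).mean()
    # No process group (e.g. Jetson builds without NCCL): plain local mean.
    return psnrs.mean()
```

Replacing the `all_gather_ddp_if_available(psnrs).mean()` call with `gather_mean(psnrs)` would avoid touching `torch.distributed.group` entirely when no backend is present.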

A little problem

When I run the command (python3 train.py --root_dir $ROOT_DIR/lego --exp_name lego --num_epochs 30 --batch_size 16384 --lr 2e-2 --eval_lpips --num_gpu 4) to train a teacher, the following error is reported:

Training: 0it [00:00, ?it/s]Traceback (most recent call last):
File "train.py", line 427, in
trainer.fit(system, ckpt_path=hparams.ckpt_path)
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 696, in fit
self._call_and_handle_interrupt(
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 650, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 735, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1166, in _run
results = self._run_stage()
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1252, in _run_stage
return self._run_train()
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1283, in _run_train
self.fit_loop.run()
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 195, in run
self.on_run_start(*args, **kwargs)
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 223, in on_run_start
self.trainer._call_lightning_module_hook("on_train_start")
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1550, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "train.py", line 201, in on_train_start
self.model.mark_invisible_cells(
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home/nvidia/mydisk/Downloads/MobileR2L/model/teacher/ngp_pl/models/networks.py", line 213, in mark_invisible_cells
cells = self.get_all_cells()
File "/home/nvidia/anaconda3/envs/ngp_pl/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home/nvidia/mydisk/Downloads/MobileR2L/model/teacher/ngp_pl/models/networks.py", line 164, in get_all_cells
indices = vren.morton3D(self.grid_coords).long()
RuntimeError: Unrecognized tensor type ID: PythonTLSSnapshot

My device is a Jetson AGX Orin Developer Kit with JetPack 5.1.2. My conda environment: CUDA 11.4, cuDNN 8.6.0, torch 1.13.0a0+d0d6b1f2.nv22.10, torchvision 0.14.1, torchaudio 0.13.1, Python 3.8.
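"Unrecognized tensor type ID: PythonTLSSnapshot" is usually a binary mismatch between a custom CUDA extension (here vren) and the installed torch build. A hedged sketch of a possible fix, assuming the extension sources live under models/csrc/ as in the ngp_pl README: rebuild the extension on the Jetson against the Nvidia torch wheel.

```shell
# Rebuild the vren CUDA extension against the currently installed torch.
# The path follows the traceback above; adjust if your checkout differs.
cd /home/nvidia/mydisk/Downloads/MobileR2L/model/teacher/ngp_pl
pip install models/csrc/ --force-reinstall --no-cache-dir
```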

Android FPS?

Have you tested the performance on Android phones? If so, which backend is used?

pytorch_lightning.utilities.exceptions.MisconfigurationException: The provided lr scheduler `CosineAnnealingLR` doesn't follow PyTorch's LRScheduler API. You should override the `LightningModule.lr_scheduler_step` hook with your own logic if you are using a custom LR scheduler.

When training the teacher by running sh benchmarking/benchmark_synthetic_nerf.sh lego, I encountered the following problem:

Traceback (most recent call last):
File "/workspace/MobileR2L/model/teacher/ngp_pl/train.py", line 427, in
trainer.fit(system, ckpt_path=hparams.ckpt_path)
File "/root/miniconda3/envs/ngp_pl/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 696, in fit
self._call_and_handle_interrupt(
File "/root/miniconda3/envs/ngp_pl/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 650, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/root/miniconda3/envs/ngp_pl/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 735, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "/root/miniconda3/envs/ngp_pl/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1147, in _run
self.strategy.setup(self)
File "/root/miniconda3/envs/ngp_pl/lib/python3.9/site-packages/pytorch_lightning/strategies/single_device.py", line 74, in setup
super().setup(trainer)
File "/root/miniconda3/envs/ngp_pl/lib/python3.9/site-packages/pytorch_lightning/strategies/strategy.py", line 153, in setup
self.setup_optimizers(trainer)
File "/root/miniconda3/envs/ngp_pl/lib/python3.9/site-packages/pytorch_lightning/strategies/strategy.py", line 141, in setup_optimizers
self.optimizers, self.lr_scheduler_configs, self.optimizer_frequencies = _init_optimizers_and_lr_schedulers(
File "/root/miniconda3/envs/ngp_pl/lib/python3.9/site-packages/pytorch_lightning/core/optimizer.py", line 194, in _init_optimizers_and_lr_schedulers
_validate_scheduler_api(lr_scheduler_configs, model)
File "/root/miniconda3/envs/ngp_pl/lib/python3.9/site-packages/pytorch_lightning/core/optimizer.py", line 351, in _validate_scheduler_api
raise MisconfigurationException(
pytorch_lightning.utilities.exceptions.MisconfigurationException: The provided lr scheduler CosineAnnealingLR doesn't follow PyTorch's LRScheduler API. You should override the LightningModule.lr_scheduler_step hook with your own logic if you are using a custom LR scheduler.

My CUDA version is 11.8, so I installed torch 2.0.0+cu118 in the ngp_pl environment; everything else matches the README.

How should I solve this problem?
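One likely cause (an assumption, since the README pins an older torch): PyTorch 2.0 renamed the scheduler base class from `_LRScheduler` to `LRScheduler`, so this pytorch_lightning version's isinstance check no longer recognizes `CosineAnnealingLR`. A minimal sketch of the override the error message itself suggests, written as a standalone function you would paste as a method onto the LightningModule subclass in train.py:

```python
def lr_scheduler_step(self, scheduler, optimizer_idx, metric):
    # Reproduce Lightning's default stepping logic so schedulers built on
    # torch>=2.0's LRScheduler base class are accepted.
    if metric is None:
        scheduler.step()
    else:
        scheduler.step(metric)
```

Alternatively, installing a torch version matching the README (torch < 2.0) should avoid the check entirely.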

iOS / Android SDK?

This is great work, which solves so many core problems outside of those directly of concern to Snap.

Is there any chance there is a lower-level iOS / Android codebase for neural rendering in a similar fashion, which you used or developed and could point me to?
