postech-cvlab / perfception
[NeurIPS2022] Official implementation of PeRFception: Perception using Radiance Fields.
License: Apache License 2.0
What are the minimum hardware requirements for running this demo?
First of all, thank you for publishing your implementation.
I want to generate the ScanNet dataset using the learned weights.
For this, from the huggingface, I downloaded the files including last.ckpt.
Then, using the demo code, I tried to render the images of the first scene (scene0000_00).
For rendering without additional training or evaluation, I slightly modified the final block of scannet.gin as follows:
run.run_render = True
run.run_train = False
run.run_eval = False
After that, I ran the demo code with
python -m run --ginc configs/scannet.gin --scene_name scene0000_00
However, when I run the demo code, it seems to take too much memory and returns the following message:
Unable to allocate array with shape (1210619520, 3) and data type float64
This issue was also mentioned in #11.
The rendering loop (predict_step in /model/plenoxel_torch/model.py) seems to render the image tensors sequentially while keeping all of them in RAM.
It would probably be better to fix this part to make the dataset more accessible.
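For what it's worth, the fix could look something like the sketch below. This is not the repo's actual code: render_frames_to_disk and render_frame are hypothetical names standing in for the per-pose rendering done in predict_step, and I save .npy files purely for illustration.

```python
import os
import numpy as np

def render_frames_to_disk(poses, render_frame, outdir):
    """Render one pose at a time and write each frame to disk immediately,
    so peak memory is one frame instead of the whole sequence."""
    os.makedirs(outdir, exist_ok=True)
    paths = []
    for i, pose in enumerate(poses):
        img = render_frame(pose)                      # (H, W, 3) float array in [0, 1]
        img8 = (np.clip(img, 0.0, 1.0) * 255.0).astype(np.uint8)
        path = os.path.join(outdir, f"image{i:06d}.npy")
        np.save(path, img8)                           # or imageio.imwrite for PNG output
        paths.append(path)
        del img, img8                                 # frame is freed before the next one
    return paths
```

With ~1.2 billion ray samples in the failing allocation above, streaming frames out like this keeps the per-iteration footprint bounded regardless of scene length.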
Anyway, in my case, I just picked one pose (frame_id=0) and rendered a single image.
The code runs without error, but it returns an unexpected result.
Fortunately, at least I can see the room-like shape (probably the room of scene0000_00, right?).
It seems that there is a pose-related problem.
The following (intermediate) pose tensors might be helpful for figuring out what is wrong.
original pose (before processing with pcd-related things)
[[[-9.554210e-01 1.196160e-01 -2.699320e-01 2.655830e+00]
[ 2.952480e-01 3.883390e-01 -8.729390e-01 2.981598e+00]
[ 4.080000e-04 -9.137200e-01 -4.063430e-01 1.368648e+00]
[ 0.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00]]]
render_pose (the finally returned one)
[[[-9.80858835e-01 2.35084399e-18 -1.94721569e-01 2.96767746e-01]
[-1.16803752e-07 9.99999718e-01 -7.10082718e-07 3.07291136e-02]
[ 1.94722179e-01 -1.46270149e-17 -9.80858767e-01 1.29165942e+00]
[ 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00]]]
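As a quick sanity check (a generic helper of my own, not from the repo), one can at least verify that a 4x4 camera-to-world matrix is a valid rigid transform, i.e. its 3x3 rotation block is orthonormal with determinant +1:

```python
import numpy as np

def is_rigid_pose(pose, atol=1e-3):
    """Check that the top-left 3x3 block of a 4x4 pose is a proper rotation."""
    R = np.asarray(pose)[:3, :3]
    orthonormal = np.allclose(R @ R.T, np.eye(3), atol=atol)
    proper = abs(np.linalg.det(R) - 1.0) < atol
    return orthonormal and proper
```

Both matrices above pass this check, which suggests the problem lies in the pcd-related recentering/rescaling step rather than in the matrices themselves being malformed.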
I'm not very familiar with NeRF-related things, so my attempts above might be wrong somewhere.
Any help would be greatly appreciated.
Thanks for this inspiring work! I am also interested in the data generation process on CO3D and ScanNet; do you have any plans to release it?
Thanks for this great dataset; it's really impressive.
I wondered if there is a way to download the representations for specific CO3D sequences from specific categories.
Do the chunks have anything to do with this?
First off, thank you for your great contribution.
Secondly, I have been exploring the PeRFception-ScanNet dataset, and I am trying to find the ground truth labels, but I am failing to do so.
If they are included in the folders, is there a location where we can find them?
If not, are there plans for you to release the ground truth files?
Additionally, since there is no documentation for the PeRFception-ScanNet, can you elaborate on the different numpy arrays (thick.npy and trans_info.npz)?
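For reference, the raw contents of those files can at least be listed with a small generic helper like this (my own sketch, not repo code); it returns the array shapes inside either a .npy or .npz file:

```python
import numpy as np

def inspect(path):
    """Return {name: shape} for the arrays stored in a .npy or .npz file."""
    data = np.load(path, allow_pickle=True)
    if hasattr(data, "files"):                      # .npz archive
        return {k: np.asarray(data[k]).shape for k in data.files}
    return {"array": data.shape}                    # plain .npy
```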
Thanks in advance for your help.
I believe the CO3D dataset was moved from OneDrive to Hugging Face; however, I am unable to find a working download link for PeRFception-ScanNet.
Can you provide a new download link? Thank you in advance.
Additionally, what was the estimated time to produce PeRFception-ScanNet given your setup? Thank you again.
The demo command here
python3 -m run --ginc configs/co3d.gin
should be changed to
python3 -m run --ginc configs/co3d_v1.gin
since you have both CO3D V1 and V2 in the codebase now.
Another issue is here, where you have two identical lines of lambda_tv_background_color. Did you mean to set lambda_tv_background_sigma here (since its default value is 1e-2), or is it just a typo?
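If it is the second case, the corrected block would presumably look like the following. This is only a hypothetical sketch: the binding target is a placeholder (only the two parameter names are quoted above), and the values are illustrative.

```gin
# Hypothetical correction; <target> stands for whatever configurable these
# parameters are bound to in the actual gin file.
<target>.lambda_tv_background_color = 1e-2
<target>.lambda_tv_background_sigma = 1e-2  # was the duplicated ..._color line
```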
Hi
I would like to visualize one of your trained plenoxels. Ideally, I would just load a ckpt and render views along a spherical path around the center object, without having to download CO3D. However, I find this challenging to do with the current code.
I was able to load your model using your on_load_checkpoint, which dequantizes the checkpoint and loads the model. Then I want to render views from it.
I decide on an intrinsic matrix:
near, far = 0., 1.
ndc_coeffs = (-1., -1.)
image_sizes = (200, 200)
focal = (100., 100.)
intrinsics = np.array(
[
[focal[0], 0.0, image_sizes[0]/2, 0.0],
[0.0, focal[1], image_sizes[1]/2, 0.0],
[0.0, 0.0, 1.0, 0.0],
[0.0, 0.0, 0.0, 1.0],
]
)
and use your function spherical_poses
to get the extrinsics
cam_trans = np.diag(np.array([-1, -1, 1, 1], dtype=np.float32))
render_poses = spherical_poses(cam_trans)
I then try to create the rays from the first pose using various of your functions
extrinsics_idx = render_poses[:1]
N_render = len(render_poses)
intrinsics_idx = np.stack(
[intrinsics for _ in range(N_render)]
)
image_sizes_idx = np.stack(
[image_sizes for _ in range(N_render)]
)
rays_o, rays_d = batchified_get_rays(
intrinsics_idx,
extrinsics_idx,
image_sizes_idx,
True,
)
rays_d = torch.tensor(rays_d, dtype=torch.float32)
rays_o = torch.tensor(rays_o, dtype=torch.float32)
rays_d = rays_d / torch.norm(rays_d, dim=-1, keepdim=True)
rays = torch.stack(
convert_to_ndc(rays_o, rays_d, ndc_coeffs), dim=1
)
rays_o = rays[:,0,:].contiguous()
rays_d = rays[:,1,:].contiguous()
rays_o = rays_o.to("cuda")
rays_d = rays_d.to("cuda")
and then try to render
rays = Rays(rays_o, rays_d)
grid = grid.to(device="cuda")
depth = grid.volume_render_depth(rays, 1e-5)
target = torch.zeros_like(rays_o)
rgb, mask = grid.volume_render_fused(rays, target)
but when I visualize the rendering it looks like I did something wrong:
depth = depth.reshape(200, 200)
rgb = rgb.reshape(200, 200, 3)
plt.imshow(depth.cpu().numpy())
plt.show()
plt.imshow(rgb.cpu().numpy())
plt.show()
Could you please help me? It would be very useful to check that the model loading is correct and to see how good the reconstructions are.
Is there anyone who can visualize a specific scene of ScanNet from the checkpoints provided by the repository https://huggingface.co/datasets/YWjimmy/PeRFception-ScanNet?
Thanks for the great work.
I'm working on Ubuntu 20.04 with CUDA 11.1 and followed your instructions in Get Ready.
But it still raises the error "ModuleNotFoundError: No module named 'cc3d'".
Am I doing something wrong or missing something?
My pip list is below.
Package Version
absl-py 1.4.0
aiohttp 3.8.4
aiosignal 1.3.1
appdirs 1.4.4
async-timeout 4.0.2
attrs 22.2.0
beautifulsoup4 4.11.2
cachetools 5.3.0
certifi 2022.12.7
charset-normalizer 3.0.1
click 8.1.3
ConfigArgParse 1.5.3
docker-pycreds 0.4.0
filelock 3.9.0
frozenlist 1.3.3
fsspec 2023.1.0
future 0.18.3
gdown 4.6.4
gin-config 0.5.0
gitdb 4.0.10
GitPython 3.1.31
google-auth 2.16.1
google-auth-oauthlib 0.4.6
grpcio 1.51.3
idna 3.4
imageio 2.26.0
imageio-ffmpeg 0.4.8
importlib-metadata 6.0.0
Markdown 3.4.1
MarkupSafe 2.1.2
multidict 6.0.4
networkx 3.0
numpy 1.24.2
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
oauthlib 3.2.2
opencv-python 4.7.0.72
packaging 23.0
pathtools 0.1.2
Pillow 9.4.0
pip 22.3.1
piqa 1.2.2
plenoxel 0.0.1.dev0+sphtexcub.lincolor.fast
plyfile 0.7.4
protobuf 4.22.0
psutil 5.9.4
pyasn1 0.4.8
pyasn1-modules 0.2.8
pyDeprecate 0.3.1
PySocks 1.7.1
pytorch-lightning 1.5.5
PyWavelets 1.4.1
PyYAML 6.0
requests 2.28.2
requests-oauthlib 1.3.1
rsa 4.9
scikit-image 0.19.3
scipy 1.10.1
sentry-sdk 1.15.0
setproctitle 1.3.2
setuptools 65.6.3
six 1.16.0
smmap 5.0.0
soupsieve 2.4
tensorboard 2.12.0
tensorboard-data-server 0.7.0
tensorboard-plugin-wit 1.8.1
tifffile 2023.2.3
torch 1.13.1
torchmetrics 0.11.1
torchvision 0.14.1
tqdm 4.64.1
typing_extensions 4.5.0
urllib3 1.26.14
wandb 0.13.10
Werkzeug 2.2.3
wheel 0.37.1
yarl 1.8.2
zipp 3.15.0
Hi, thanks for releasing this wonderful work. It's very helpful and I'm planning to apply it to some downstream tasks in my project. You've mentioned in README that
"We are planning to extend this work to PeRFception-CO3D-v2 from the CO3D-v2."
Since it's always better to use a higher-quality dataset, I wonder whether you have an estimated release date (if you plan to release it). Is training already underway, or is it just "planned" so far? I completely understand that training so many NeRFs takes a huge effort and that benchmarking everything is very time-consuming (actually, I only need the trained Plenoxel weights). I just want to know the progress so I can better schedule my project. Thanks in advance!
python3 utils/download_perf.py --dataset co3d --outdir data/co3d/
exits without doing anything and looking at the script it is obvious why. Downloading chunks works as expected.
Running
python3 utils/download_perf.py --dataset co3d --outdir data/co3d/
creates txt files with the error:
Sorry, there was a problem downloading some files from OneDrive. Please try again.
OneDrive_1 - https://onedrive.live.com/?id=60A1A318FA7A3606%211420&action=Download&authKey=!ACaUbVBSIuDvCrI