
planetr3d's Introduction

PlaneTR: Structure-Guided Transformers for 3D Plane Recovery

This is the official implementation of our ICCV 2021 paper.

News

  • 2023.02.09: Uploaded the depth evaluation code for the NYUv2 dataset.
  • 2021.08.08: The visualization code is now available; you can find it in 'disp.py'. A simple example of how to visualize the results is shown in 'eval_planeTR.py'.

TODO

  • Add more 2D/3D visualization code.

Getting Started

Clone the repository:

git clone https://github.com/IceTTTb/PlaneTR3D.git

Our implementation uses Python 3.6 and PyTorch 1.6.0. Please install the dependencies:

conda create -n planeTR python=3.6
conda activate planeTR
conda install pytorch=1.6.0 torchvision=0.7.0 torchaudio cudatoolkit=10.2 -c pytorch
pip install -r requirements.txt
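
You can quickly verify the environment before moving on (an optional check, not part of the repository):

# Optional sanity check for the environment
import torch
import torchvision

print(torch.__version__)          # expected: 1.6.0
print(torchvision.__version__)    # expected: 0.7.0
print(torch.cuda.is_available())  # should be True on a CUDA 10.2 machine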

Data Preparation

We train and test our network on the plane dataset created by PlaneNet. We follow PlaneAE to convert the .tfrecords to .npz files. Please refer to PlaneAE for more details.

We generate line segments using the state-of-the-art line segment detector HAWP with its pretrained model. The processed line segment data we used can be downloaded here.

The structure of the data folder should be

plane_data/
  --train/*.npz
  --train_img/*
  --val/*.npz
  --val_img/*
  --train.txt
  --val.txt
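
To sanity-check the converted data, you can inspect one sample in Python (a minimal sketch; the file name below is hypothetical, and the arrays stored in each .npz depend on your conversion):

# Minimal sketch: inspect one converted sample.
# The file name '0.npz' is hypothetical; we list the stored arrays
# rather than assuming specific key names.
import numpy as np

sample = np.load('plane_data/train/0.npz')
for key in sample.files:
    print(key, sample[key].shape, sample[key].dtype)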

Training

Download the pretrained model of HRNet and place it under the 'ckpts/' folder.

Change the 'root_dir' in the config files to the path where you saved the data.
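
If you want to confirm the path programmatically, here is a minimal sketch (the config file name follows this repo; the key layout inside it is not assumed, so the whole config is printed):

# Minimal sketch: print the parsed config and confirm 'root_dir'.
import yaml

with open('config_planeTR_train.yaml') as f:
    cfg = yaml.safe_load(f)
print(cfg)  # look for the 'root_dir' entry and check it points at plane_data/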

Run the following command to train our network on one GPU:

CUDA_VISIBLE_DEVICES=0 python train_planeTR.py

Run the following command to train our network on multiple GPUs:

CUDA_VISIBLE_DEVICES=0,1,2 python -m torch.distributed.launch --nproc_per_node=3 --master_port 29502 train_planeTR.py

Evaluation on ScanNet

Download the pretrained model here and place it under the 'ckpts/' folder.

Change the 'resume_dir' in 'config_planeTR_eval.yaml' to the path where you saved the weight file.

Change the 'root_dir' in the config files to the path where you saved the data.

Run the following command to evaluate the performance:

CUDA_VISIBLE_DEVICES=0 python eval_planeTR.py

Evaluation on NYUv2

To evaluate depth on the NYUv2 dataset, first download the original data from here and the official train/test split from here. You also need to download our line detection results from here. After downloading everything, put it under the 'nyudata/' folder. The structure of the data folder should be:

planeTR3D
  --nyudata
    |--nyu_depth_v2_labeled.mat
    |--splits.mat
    |--line_info/
        |--*.txt
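
Note that 'nyu_depth_v2_labeled.mat' is stored in MATLAB v7.3 (HDF5) format while 'splits.mat' is an older .mat file, so they are loaded differently. A minimal sketch (the 'depths' and 'testNdxs' key names come from the official NYUv2 release, not from this repo):

# Minimal sketch for loading the NYUv2 files (not this repo's loader).
# nyu_depth_v2_labeled.mat is MATLAB v7.3 (HDF5) -> use h5py;
# splits.mat is an older .mat format -> use scipy.io.loadmat.
import h5py
from scipy.io import loadmat

with h5py.File('nyudata/nyu_depth_v2_labeled.mat', 'r') as f:
    depths = f['depths'][:]  # depth maps in meters, one per image

splits = loadmat('nyudata/splits.mat')
test_idx = splits['testNdxs'].squeeze()  # 1-based MATLAB indices
print(depths.shape, test_idx.shape)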

Run the following command to evaluate the performance:

CUDA_VISIBLE_DEVICES=0 python eval_nyudepth.py

Unfortunately, I lost the original detected line segments used in the paper, so I regenerated the line segments on the NYUv2 dataset with the latest HAWPv3. The depth metrics with these regenerated line segments are:

Rel    log10  RMSE (m)  d1 (%)  d2 (%)  d3 (%)
0.196  0.096  0.812     63.8    88.3    96.0
The new results are slightly different from those in the paper, but they do not affect the conclusions.
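
For reference, these are the standard monocular depth metrics; below is a minimal sketch of the usual definitions (the exact evaluation details live in 'eval_nyudepth.py'):

# Standard depth metrics (sketch of the usual definitions; see
# eval_nyudepth.py for the evaluation actually used here).
import numpy as np

def depth_metrics(pred, gt):
    # pred, gt: 1-D arrays of valid depths in meters
    rel = np.mean(np.abs(pred - gt) / gt)                   # Rel
    log10 = np.mean(np.abs(np.log10(pred) - np.log10(gt)))  # log10
    rmse = np.sqrt(np.mean((pred - gt) ** 2))               # RMSE (m)
    ratio = np.maximum(pred / gt, gt / pred)
    d1 = np.mean(ratio < 1.25) * 100                        # d1 (%)
    d2 = np.mean(ratio < 1.25 ** 2) * 100                   # d2 (%)
    d3 = np.mean(ratio < 1.25 ** 3) * 100                   # d3 (%)
    return rel, log10, rmse, d1, d2, d3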

Citations

If you find our work useful in your research, please consider citing:

@inproceedings{tan2021planeTR,
  title     = {PlaneTR: Structure-Guided Transformers for 3D Plane Recovery},
  author    = {Tan, Bin and Xue, Nan and Bai, Song and Wu, Tianfu and Xia, Gui-Song},
  booktitle = {International Conference on Computer Vision},
  year      = {2021}
}

Contact

[email protected]

https://xuenan.net/

Acknowledgements

We thank the authors of PlaneAE, PlaneRCNN, interplane, and DETR. Our implementation builds heavily on their code.


planetr3d's Issues

Some evaluation results of the planeTR (no line segments) model differ from those in your paper?

Hi,
I modified 'config_planeTR_train.yaml', setting model.use_lines=False and model.loss_layer_num=1 and leaving the other items unchanged. After training, I evaluated the model and found that the per-plane/pixel recalls at small error thresholds were very different from those in your paper, as shown in the screenshots below.

[two screenshots of evaluation results]

Do you know the reason? Is there any other training configuration that needs to be modified?

Thanks!

NYUv2 plane segmentation evaluation

Hi,

It seems that there is no NYUv2 segmentation evaluation in this repo? I found the NYUv2 segmentation ground truth in a PlaneAE issue (svip-lab/PlanarReconstruction#27; is that what you used?) and used your code for evaluation, but the results are somewhat different from Table 1 of the paper. I'm not sure whether I overlooked some key detail. Could you please release the NYUv2 segmentation evaluation code and ground truth?

A lot of thanks!

inference on single image?

Hi, thank you for the great work. Would you please share the inference script for planeTR? I would like to test your model on my own images. Thanks!

Looking for train.txt and val.txt

Congratulations on the great work!
I am looking for the two files train.txt and val.txt but could not find them. Could you please tell me where I can get them, or what their contents should be?
Thanks a lot.

When loss_layer_num > 1, why is only one layer used for the auxiliary loss?

Thanks for your great work!
loss_layer_num is set to 3 in the config, but only layer[1] and layer[-1] are optimized?

if self.loss_layer_num > 1 and self.training:
    aux_outputs = []
    aux_l = {'pred_logits': plane_prob[1],
             'pred_plane_embedding': plane_embedding[1],
             'pixel_embedding': pixel_embedding}
    if self.predict_center:
        aux_l['pred_center'] = plane_center[1]
    aux_outputs.append(aux_l)

How to get the planes_scannet_train.tfrecords?

Hi,

Thank you for such a great job!

I have trouble downloading planes_scannet_train.tfrecords from the link provided by PlaneNet; I simply cannot download the data, and the other Box link does not work (the file has been removed). Would you mind sharing your downloaded .tfrecords or preprocessed .npz files? Thanks a lot!

Number of lines

Hello! In the line segment detection stage, are the detected line segments processed into a fixed number for different images? If so, what is that number set to?
