

Self-supervised Spatial Reasoning on Multi-View Line Drawings [CVPR 2022]

Siyuan Xiang*, Anbang Yang*, Yanfei Xue, Yaoqing Yang, Chen Feng

We significantly improve SPARE3D baselines using self-supervised learning approaches.

Poster Page: https://ai4ce.github.io/Self-Supervised-SPARE3D/

ArXiv: Self-supervised Spatial Reasoning on Multi-View Line Drawings

Abstract

State-of-the-art supervised deep networks have recently been shown to achieve puzzlingly low performance on spatial reasoning over multi-view line drawings in the SPARE3D dataset. Based on the observation that self-supervised learning helps when a large amount of data is available, we propose two self-supervised learning approaches to improve the baseline performance on the view-consistency reasoning and camera-pose reasoning tasks in SPARE3D. For the first task, we use a self-supervised binary classification network to contrast the line drawing differences between various views of any two similar 3D objects, enabling the trained networks to learn detail-sensitive yet view-invariant line drawing representations of 3D objects. For the second task, we propose a self-supervised multi-class classification framework that trains a model to select the correct camera pose from which a line drawing is rendered. Our method even helps on downstream tasks with unseen camera poses. Experiments show that our method significantly improves the baseline performance on SPARE3D, while several popular self-supervised learning methods do not.
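Both pretext tasks reduce to standard classification objectives: a binary cross-entropy over same/different-object view pairs for view-consistency reasoning, and a multi-class cross-entropy over candidate camera poses for pose reasoning. A minimal, framework-free sketch of these two losses (pure Python; the logit values and the eight-pose setup below are toy illustrations, not the paper's actual network outputs):

```python
import math

def binary_ce(logit, label):
    """Binary cross-entropy on a same/different-object pair score.

    label = 1 if the two line drawings are rendered from the same 3D object,
    0 if they come from two similar but different objects."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def multiclass_ce(logits, target):
    """Cross-entropy for selecting the camera pose a drawing was rendered from."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target]

# A pair confidently scored as "same object" gives a small loss at label 1
# and a large loss at label 0.
same_loss = binary_ce(2.0, 1)
diff_loss = binary_ce(2.0, 0)

# Toy pose task: eight candidate poses, with the true pose at index 2.
pose_logits = [0.1, 0.2, 3.0, 0.0, -0.5, 0.3, 0.1, 0.2]
pose_loss = multiclass_ce(pose_logits, 2)
```

Training the encoder to minimize these losses on automatically generated pairs/poses is what makes the approach self-supervised: the labels come from the rendering process, not from human annotation.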

Data

You can download the dataset via our Google Drive link. The Google Drive folder contains two items:

  1. contrastive_spatial_reasoning.7z, which contains "contrastive data" and "supervised data". The "contrastive data" is for the contrastive spatial reasoning method; the "supervised data" is for fine-tuning.
  2. The contrastive_model folder, which contains the model trained with our contrastive spatial reasoning method (learning rate 5e-05).

Dependencies & Our code

Requires Python 3.x, PyTorch, and PythonOCC. Running on a GPU is highly recommended. The code has been tested with Python 3.8.5 and PyTorch 1.8.0 with CUDA 11.1.

Task code

We significantly improve the SPARE3D task performance. Specifically, we design a contrastive spatial reasoning method for the T2I task. Code for the three tasks (T2I, I2P, P2I) can be found under the Tasks folder.

Run I2P_trainer.py with the parameters explained in args in the code. Our exploration experiments are under the structure_explore folder: network_structure contains the controlled experiments on network structure, such as with/without the adaptive pooling layer, dropout layer, and fully connected layer, and whether to use ImageNet pre-trained parameters; network_capacity contains the controlled experiments on the width and depth of the baseline network.
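One of the components toggled in the network_structure ablations, the adaptive pooling layer, maps a feature map of arbitrary spatial size to a fixed output size so the fully connected head sees a constant input dimension. A minimal NumPy re-implementation of 2-D adaptive average pooling, for illustration only (the repo itself would use PyTorch's nn.AdaptiveAvgPool2d):

```python
import numpy as np

def adaptive_avg_pool2d(x, out_h, out_w):
    """Average-pool a 2-D feature map x to a fixed (out_h, out_w) size.

    Output cell (i, j) averages the input window
    rows [floor(i*h/out_h), ceil((i+1)*h/out_h)),
    cols [floor(j*w/out_w), ceil((j+1)*w/out_w)),
    the usual definition of adaptive average pooling."""
    h, w = x.shape
    out = np.empty((out_h, out_w), dtype=float)
    for i in range(out_h):
        hs, he = (i * h) // out_h, -((-(i + 1) * h) // out_h)  # floor, ceil
        for j in range(out_w):
            ws, we = (j * w) // out_w, -((-(j + 1) * w) // out_w)
            out[i, j] = x[hs:he, ws:we].mean()
    return out

# A 4x4 feature map pooled to 2x2 yields the four 2x2 block means.
fmap = np.arange(16, dtype=float).reshape(4, 4)
pooled = adaptive_avg_pool2d(fmap, 2, 2)
```

Because the window boundaries are derived from the input size, the same layer accepts any input resolution, which is what makes it easy to ablate independently of the rest of the network.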

Run P2I_trainer.py with the parameters explained in args in the code.

The contrastive_spatial_reasoning folder contains the code for the contrastive spatial reasoning method. Run Three2I_trainer.py with the parameters under the Contrastive_learning folder. To fine-tune the network, run Three2I_opt2_trainer.py with the parameters under the Fine_tune folder.

Attention map generation

Generate attention maps for a trained model using attention_map.py, passing the image path and the path to the trained model.
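attention_map.py needs the actual trained model and input drawings, but the generic post-processing behind such visualizations, rescaling a low-resolution activation map into [0, 1] and upsampling it to overlay on the input image, can be sketched without any model. This is an illustrative sketch with hypothetical helper names, not the repo's exact code:

```python
import numpy as np

def normalize_map(act):
    """Shift and scale an activation map into [0, 1] for display as a heatmap."""
    act = act.astype(float) - act.min()
    peak = act.max()
    return act / peak if peak > 0 else act

def upsample_nearest(act, factor):
    """Nearest-neighbour upsample so the map matches the input image size."""
    return np.repeat(np.repeat(act, factor, axis=0), factor, axis=1)

# Toy 2x2 "attention" map blown up to 4x4 with values normalized to [0, 1].
att = np.array([[0.0, 2.0], [1.0, 4.0]])
heat = upsample_nearest(normalize_map(att), factor=2)
```

The normalized map is then typically alpha-blended over the original line drawing to show which strokes the network attends to.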

To cite our paper:

@misc{xiang2021contrastive,
      title={Contrastive Spatial Reasoning on Multi-View Line Drawings}, 
      author={Siyuan Xiang and Anbang Yang and Yanfei Xue and Yaoqing Yang and Chen Feng},
      year={2021},
      eprint={2104.13433},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgment

The research is supported by NSF Future Manufacturing program under EEC-2036870. Siyuan Xiang gratefully thanks the IDC Foundation for its scholarship. We also thank the anonymous reviewers for constructive feedback.

Contributors

endeleze, simbaforrest, siyuan2018, yf-xue

