
3D Part Guided Image Editing for Fine-grained Object Understanding (CVPR 2020)

CVPR 2020 Paper | Video

Zongdai Liu, Feixiang Lu, Peng Wang, Hui Miao, Liangjun Zhang, Ruigang Yang, Bin Zhou

Abstract

Holistically understanding an object together with its 3D movable parts is essential for the visual models of a robot that interacts with the world. For example, only by understanding the many possible part dynamics of other vehicles (e.g., a door or trunk opening, a taillight blinking before a lane change) can a self-driving vehicle successfully handle emergency cases. However, existing visual models rarely tackle these situations and focus instead on bounding box detection. In this paper, we fill this important missing piece in autonomous driving by solving two critical issues. First, to deal with data scarcity, we propose an effective training data generation process that fits a 3D car model with dynamic parts to cars in real images. This allows us to edit the real images directly using the aligned 3D parts, yielding effective training data for learning robust deep neural networks (DNNs). Second, to benchmark the quality of 3D part understanding, we collected a large dataset of real driving scenarios containing cars in uncommon states (CUS), i.e., with a door or trunk opened, etc. It demonstrates that our network trained with edited images largely outperforms other baselines in terms of 2D detection and instance segmentation accuracy.

Requirements

  • python 3.6, cuda 9.2, pytorch 1.2.0, torchvision 0.4.0;
  • python-opencv, pycocotools

Inference

python tool/infer.py --pretrained_model ./pretrained_model/state_rcnn_double_backbone.pth --input_dir ./demo/imgs --output_dir ./demo/res

The pretrained model can be downloaded from BaiduNetdisk (password: owov) or GoogleDrive.
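Before running inference, it can help to verify that the downloaded checkpoint loads correctly. The sketch below assumes the `.pth` file is a standard PyTorch state dict (possibly wrapped in a dict under a `state_dict` key); the key names printed will be whatever the repo's model actually uses.

```python
# Sketch: inspect a downloaded checkpoint before running inference.
# Assumes a standard PyTorch state dict (or a dict wrapping one under
# "state_dict"); this is an assumption, not the repo's documented format.
import torch

def summarize_checkpoint(path):
    ckpt = torch.load(path, map_location="cpu")
    # Unwrap a {"state_dict": ...} container if present.
    state = ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt
    print(f"{len(state)} tensors in checkpoint")
    # Show the first few parameter names and shapes as a sanity check.
    for name, tensor in list(state.items())[:5]:
        print(name, tuple(tensor.shape))
    return state
```

Usage: `summarize_checkpoint("./pretrained_model/state_rcnn_double_backbone.pth")` should list the model's parameter tensors; an error here usually means a corrupted or incomplete download.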

Training

The editing data (27k images in total) can be downloaded from BaiduNetdisk (password: kmve). Download the editing data and place a symlink (or the actual data) in EditingForDNN/editing_data/.

cd EditingForDNN
mkdir editing_data
ln -s /path/images ./editing_data/
ln -s /path/cus_editing_data.json ./editing_data/
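After linking the data in, a quick sanity check can confirm the annotation file is readable. The sketch below assumes `cus_editing_data.json` follows a COCO-style layout with top-level `images` and `annotations` lists (the repo lists pycocotools as a requirement, but the actual schema may differ).

```python
# Sketch: sanity-check the editing data after setting up editing_data/.
# Assumes a COCO-style JSON with "images" and "annotations" lists;
# this schema is an assumption, not confirmed by the repo.
import json

def summarize_editing_data(json_path):
    with open(json_path) as f:
        data = json.load(f)
    n_imgs = len(data.get("images", []))
    n_anns = len(data.get("annotations", []))
    print(f"{n_imgs} images, {n_anns} annotations")
    return n_imgs, n_anns
```

Usage: `summarize_editing_data("./editing_data/cus_editing_data.json")` should report image/annotation counts on the order of the 27k edited images; zero counts suggest the symlink points to the wrong location.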

Next, download the main-backbone and aux-backbone pretrained models from BaiduNetdisk (password: fmkx) or GoogleDrive (main) and GoogleDrive (aux), and put them in ./pretrained_model/.

Train the model with 4 GPUs:

python -m torch.distributed.launch --nproc_per_node=4 --use_env tool/train.py
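When launched with `--use_env`, `torch.distributed.launch` passes each worker its identity through the `RANK`, `WORLD_SIZE`, and `LOCAL_RANK` environment variables rather than a `--local_rank` argument. The sketch below mirrors how a training script typically reads these; the repo's `tool/train.py` may differ in detail.

```python
# Sketch: how a worker started via
#   python -m torch.distributed.launch --nproc_per_node=4 --use_env tool/train.py
# typically discovers its rank. With --use_env, the launcher sets these
# environment variables instead of passing --local_rank on the command line.
import os

def get_dist_info():
    rank = int(os.environ.get("RANK", 0))            # global rank of this worker
    world_size = int(os.environ.get("WORLD_SIZE", 1))  # total number of workers
    local_rank = int(os.environ.get("LOCAL_RANK", 0))  # GPU index on this node
    return rank, world_size, local_rank
```

With `--nproc_per_node=4` on a single node, the four workers would see `WORLD_SIZE=4` and ranks 0 through 3, each pinned to its own GPU via `LOCAL_RANK`.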

CUS Dataset

CUS Dataset will be released soon.

Contact

For questions regarding our work, feel free to open an issue here or contact the authors directly ([email protected]).
