Code Monkey home page Code Monkey logo

pionnerlc / gsnet Goto Github PK

View Code? Open in Web Editor NEW

This project forked from lkeab/gsnet

0.0 0.0 0.0 15.32 MB

Official code of ECCV 2020 paper "GSNet: Joint Vehicle Pose and Shape Reconstruction with Geometrical and Scene-aware Supervision". GSNet performs joint vehicle pose estimation and vehicle shape reconstruction with single RGB image as input.

Home Page: https://arxiv.org/abs/2007.13124

Python 90.17% C++ 3.50% Cuda 6.31% Shell 0.02%

gsnet's Introduction

PWC PWC

GSNet: Joint Vehicle Pose and Shape Reconstruction with Geometrical and Scene-aware Supervision [ECCV'20]

Code and 3D car mesh models for the ECCV 2020 paper "GSNet: Joint Vehicle Pose and Shape Reconstruction with Geometrical and Scene-aware Supervision". GSNet performs joint vehicle pose estimation and vehicle shape reconstruction with single RGB image as input.

Abstract

We present a novel end-to-end framework named as GSNet (Geometric and Scene-aware Network), which jointly estimates 6DoF poses and reconstructs detailed 3D car shapes from single urban street view. GSNet utilizes a unique four-way feature extraction and fusion scheme and directly regresses 6DoF vehicle poses and shapes in a single forward pass. Extensive experiments show that our diverse feature extraction and fusion scheme can greatly improve model performance. Based on a divide-and-conquer 3D shape representation strategy, GSNet reconstructs 3D vehicle shape with great detail (1352 vertices and 2700 faces). This dense mesh representation further leads us to consider geometrical consistency and scene context, and inspires a new multi-objective loss function to regularize network training, which in turn improves the accuracy of 6D pose estimation and validates the merit of jointly performing both tasks.

Results on ApolloCar3D benchmark

(Check Table 3 of the paper for full results)

Method A3DP-Rel-mean A3DP-Abs-mean
DeepMANTA (CVPR'17) 16.04 20.1
3D-RCNN (CVPR'18) 10.79 16.44
Kpt-based (CVPR'19) 16.53 20.4
Direct-based (CVPR'19) 11.49 15.15
GSNet (ECCV'20) 20.21 18.91

Installation

We build GSNet based on the Detectron2 developed by FAIR. Please first follow its readme file. We recommend the Pre-Built Detectron2 (Linux only) version with pytorch 1.5 by the following command:

python -m pip install detectron2 -f \
  https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.5/index.html

Dataset Preparation

The ApolloCar3D dataset is detailed in paper ApolloCar3D and the corresponding images can be obtained from link. We provide our converted car meshes (same topology), kpts, bounding box, 3d pose annotations etc. in coco format under the car_deform_result and datasets/apollo/annotations folders.

Environment

Using Our Car mesh models

car_deform_result: We provide 79 types of ground truth car meshes with the same topology (1352 vertices and 2700 faces) converted using SoftRas (https://github.com/ShichenLiu/SoftRas)

The file car_models.py has a detailed description on the car id and car type correspondance.

merge_mean_car_shape: The mean car shape of the four shape basis used by four independent PCA models.

pca_components: The learned weights of the four PCA models.

Image of GSNet shape reconstruction

How to use our car mesh models? Please refer to the class StandardROIHeads in roi_heads.py, which contains the core inference code for ROI head of GSNet. It relies on the SoftRas to load and manipulate the car meshes.

Run GSNet

Please follow the readme page (including the pretrained model).

Citation

Please star this repository and cite the following paper in your publications if it helps your research:

@InProceedings{Ke_2020_ECCV,
    author = {Ke, Lei and Li, Shichao and Sun, Yanan and Tai, Yu-Wing and Tang, Chi-Keung},
    title = {GSNet: Joint Vehicle Pose and Shape Reconstruction with Geometrical and Scene-aware Supervision},
    booktitle = {European Conference on Computer Vision (ECCV)},
    year = {2020}
}

Related repo based on detectron2: BCNet, CVPR'21- a bilayer instance segmentation method

Related work reading: EgoNet, CVPR'21

License

A MIT license is used for this repository. However, certain third-party datasets, such as (ApolloCar3D), are subject to their respective licenses and may not grant commercial use.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.