
iccv23-kecor-active-3ddet's Introduction


This is the official PyTorch implementation of our ICCV 2023 paper, KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection.

[ICCV Page] [arXiv] [Supp]

News

🔥 08/13 updates: under development. Checkpoints will be uploaded soon.

🔥 03/18 updates: The checkpoints can be found at https://drive.google.com/drive/folders/1xbEI3tSfTCHIt3m8tk4hAy4qBJ55NuqL?usp=sharing

Citation

@inproceedings{DBLP:conf/iccv/LuoCF0BH23,
  author       = {Yadan Luo and
                  Zhuoxiao Chen and
                  Zhen Fang and
                  Zheng Zhang and
                  Mahsa Baktashmotlagh and
                  Zi Huang},
  title        = {Kecor: Kernel Coding Rate Maximization for Active 3D Object Detection},
  booktitle    = {{IEEE/CVF} International Conference on Computer Vision, {ICCV} 2023, Paris, France, October 1-6, 2023},
  pages        = {18233--18244},
  publisher    = {{IEEE}},
  year         = {2023}
}

Framework

Achieving a reliable LiDAR-based object detector in autonomous driving is paramount, but its success hinges on obtaining large amounts of precise 3D annotations. Active learning (AL) seeks to mitigate the annotation burden through algorithms that use fewer labels yet attain performance comparable to fully supervised learning. Although AL has shown promise, current approaches prioritize the selection of unlabeled point clouds with high aleatoric and/or epistemic uncertainty, leading to the selection of more instances for labeling and reduced computational efficiency. In this paper, we propose a novel kernel coding rate maximization (KECOR) strategy, which aims to identify the most informative point clouds to label through the lens of information theory. Greedy search is applied to find the point clouds that maximize the minimal number of bits required to encode the latent features. To determine the uniqueness and informativeness of the selected samples from the model perspective, we construct a proxy network of the 3D detector head and compute the outer product of Jacobians from all proxy layers to form the empirical neural tangent kernel (NTK) matrix. To accommodate both one-stage (e.g., SECOND) and two-stage (e.g., PV-RCNN) detectors, we further incorporate classification entropy maximization and achieve a good trade-off between detection performance and the total number of bounding boxes selected for annotation. Extensive experiments conducted on two 3D benchmarks and a 2D detection dataset demonstrate the superiority and versatility of the proposed approach. Our results show that box-level annotation costs are reduced by approximately 44% and computational time by 26% compared to the state-of-the-art AL method, without compromising detection performance.
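As a rough illustration of the greedy coding-rate idea described above, the following NumPy sketch scores a subset by the commonly used rate-distortion form R = 1/2 · log det(I + d / (n · ε²) · K), where K is the n × n kernel of the subset and d is the feature dimension. It is not the repository's implementation: the function names, the `eps` default, and the toy Jacobian matrix are assumptions made for illustration, and the kernel is simply the product of a small hypothetical Jacobian matrix with its transpose.

```python
import numpy as np


def coding_rate(kernel, dim, eps=0.5):
    """Return 0.5 * logdet(I + dim / (n * eps^2) * kernel) for an n x n PSD kernel."""
    n = kernel.shape[0]
    scale = dim / (n * eps ** 2)
    _, logdet = np.linalg.slogdet(np.eye(n) + scale * kernel)
    return 0.5 * logdet


def greedy_select(kernel, dim, budget, eps=0.5):
    """Greedily pick `budget` indices whose kernel sub-matrix has the largest coding rate."""
    selected = []
    remaining = list(range(kernel.shape[0]))
    for _ in range(budget):
        best_idx, best_rate = None, -np.inf
        for i in remaining:
            idx = selected + [i]
            rate = coding_rate(kernel[np.ix_(idx, idx)], dim, eps)
            if rate > best_rate:  # strict '>' keeps the earliest index on ties
                best_idx, best_rate = i, rate
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected


# Toy stand-in for per-sample Jacobians (one row per sample); samples 0 and 1 are duplicates.
jacobians = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
ntk = jacobians @ jacobians.T  # empirical kernel built from the (hypothetical) Jacobians
picked = greedy_select(ntk, dim=jacobians.shape[1], budget=2)
```

In this toy example the duplicate of sample 0 adds little to the coding rate, so the greedy search picks sample 2 instead. The exhaustive inner loop re-evaluates the determinant for every candidate, which is only practical for small pools.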


Installation

Requirements

All code was tested in the following environment:

Install pcdet v0.5

Our implementations of 3D detectors are based on the latest OpenPCDet. To install the pcdet library and its dependencies, run the following command:

python setup.py develop

NOTE: Please re-install even if you have already installed pcdet previously.

Getting Started

The active learning configs for the different AL methods are located at tools/cfgs/active-kitti_models and tools/cfgs/active-waymo_models. The dataset configs are located within tools/cfgs/dataset_configs, and the model configs for the different datasets are located within tools/cfgs.

Dataset Preparation

Currently, we provide dataloaders for the KITTI and Waymo datasets; support for more datasets is on the way.

KITTI Dataset

  • Please download the official KITTI 3D object detection dataset and organize the downloaded files as follows (the road planes could be downloaded from [road plane], which are optional for data augmentation in the training):
  • If you would like to train CaDDN, download the precomputed depth maps for the KITTI training set
  • NOTE: if you already have the data infos from pcdet v0.1, you can choose to use the old infos and set the DATABASE_WITH_FAKELIDAR option in tools/cfgs/dataset_configs/kitti_dataset.yaml as True. The second choice is that you can create the infos and gt database again and leave the config unchanged.
KECOR-active-3Ddet
├── data
│   ├── kitti
│   │   │── ImageSets
│   │   │── training
│   │   │   ├── calib & velodyne & label_2 & image_2 & (optional: planes) & (optional: depth_2)
│   │   │── testing
│   │   │   ├── calib & velodyne & image_2
├── pcdet
├── tools
  • Generate the data infos by running the following command:
python -m pcdet.datasets.kitti.kitti_dataset create_kitti_infos tools/cfgs/dataset_configs/kitti_dataset.yaml
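
Before generating the infos, it can help to confirm that the folders match the layout shown above. The following is a minimal sketch, not part of the repository: the helper name `missing_kitti_dirs` and the list `REQUIRED_KITTI_DIRS` are hypothetical, and the optional `planes` and `depth_2` folders are deliberately not checked.

```python
from pathlib import Path

# Sub-directories named in the layout above; optional ones (planes, depth_2) are not checked.
REQUIRED_KITTI_DIRS = [
    "ImageSets",
    "training/calib",
    "training/velodyne",
    "training/label_2",
    "training/image_2",
    "testing/calib",
    "testing/velodyne",
    "testing/image_2",
]


def missing_kitti_dirs(kitti_root):
    """Return the required sub-directories that do not exist under `kitti_root`."""
    root = Path(kitti_root)
    return [rel for rel in REQUIRED_KITTI_DIRS if not (root / rel).is_dir()]


if __name__ == "__main__":
    # Run from the repository root so that data/kitti resolves correctly.
    missing = missing_kitti_dirs("data/kitti")
    print("missing:", missing if missing else "none")
```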

Waymo Open Dataset

  • Please download the official Waymo Open Dataset, including the training data training_0000.tar~training_0031.tar and the validation data validation_0000.tar~validation_0007.tar.
  • Extract all of the above xxxx.tar files into the data/waymo/raw_data directory as follows (you should get 798 train tfrecord files and 202 val tfrecord files):
KECOR-active-3Ddet
├── data
│   ├── waymo
│   │   │── ImageSets
│   │   │── raw_data
│   │   │   │── segment-xxxxxxxx.tfrecord
│   │   │   │── ...
│   │   │── waymo_processed_data_v0_5_0
│   │   │   │── segment-xxxxxxxx/
│   │   │   │── ...
│   │   │── waymo_processed_data_v0_5_0_gt_database_train_sampled_1/
│   │   │── waymo_processed_data_v0_5_0_waymo_dbinfos_train_sampled_1.pkl
│   │   │── waymo_processed_data_v0_5_0_gt_database_train_sampled_1_global.npy (optional)
│   │   │── waymo_processed_data_v0_5_0_infos_train.pkl (optional)
│   │   │── waymo_processed_data_v0_5_0_infos_val.pkl (optional)
├── pcdet
├── tools
  • Install the official waymo-open-dataset by running the following command:
pip3 install --upgrade pip
pip3 install waymo-open-dataset-tf-2-0-0==1.2.0 --user

The waymo-open-dataset version used in our project is 1.2.0.
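
To confirm which version ended up in the active environment, a small check with the standard-library `importlib.metadata` module may be enough. This is only a sketch: the helper name `installed_version` is hypothetical, and the distribution name is taken from the pip command above.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


expected = "1.2.0"
found = installed_version("waymo-open-dataset-tf-2-0-0")
if found != expected:
    print(f"waymo-open-dataset-tf-2-0-0: expected {expected}, found {found}")
```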

  • Extract point cloud data from the tfrecord files and generate data infos by running the following command (it takes several hours; you can check data/waymo/waymo_processed_data_v0_5_0 to see how many records have been processed):
python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos \
    --cfg_file tools/cfgs/dataset_configs/waymo_dataset.yaml

Note that you do not need to install waymo-open-dataset if you have already processed the data before and do not need to evaluate with official Waymo Metrics.
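
Because the extraction step takes hours, a quick way to watch its progress is to compare the number of processed segment folders with the number of raw tfrecord files. The sketch below assumes that each segment gets its own `segment-*` folder under `waymo_processed_data_v0_5_0`, as the layout above suggests; the helper name `processing_progress` is hypothetical, and a folder may appear before its segment is fully processed, so the count is only a rough indicator.

```python
from pathlib import Path


def processing_progress(waymo_root="data/waymo"):
    """Return (processed segment folders, raw tfrecord files) found under `waymo_root`."""
    root = Path(waymo_root)
    raw_dir = root / "raw_data"
    out_dir = root / "waymo_processed_data_v0_5_0"
    raw = len(list(raw_dir.glob("segment-*.tfrecord"))) if raw_dir.is_dir() else 0
    done = 0
    if out_dir.is_dir():
        done = sum(1 for p in out_dir.glob("segment-*") if p.is_dir())
    return done, raw


if __name__ == "__main__":
    done, raw = processing_progress()
    print(f"processed {done} of {raw} segments")
```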

Training & Testing

Test and evaluate the pretrained models

The weights of our pre-trained model will be released upon acceptance.

  • Test with a pretrained model:
python test.py --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} --ckpt ${CKPT}
  • To test all the saved checkpoints of a specific training setting and draw the performance curve in TensorBoard, add the --eval_all argument:
python test.py --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} --eval_all
  • To test with multiple GPUs:
sh scripts/dist_test.sh ${NUM_GPUS} \
    --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE}

# or

sh scripts/slurm_test_mgpu.sh ${PARTITION} ${NUM_GPUS} \
    --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE}

Train a backbone

In our active learning setting, the 3D detector is pre-trained with a small labeled set $\mathcal{D}_L$, which is randomly sampled from the training set. To train such a backbone, please run

sh scripts/${DATASET}/train_${DATASET}_backbone.sh
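
The random split into an initial labeled set $\mathcal{D}_L$ and an unlabeled pool can be pictured with the short sketch below. It is an illustration only, not the repository's sampling code: the helper name `split_initial_pool`, the seed, and the example sizes (1000 frames, 100 labeled) are assumptions.

```python
import random


def split_initial_pool(num_frames, num_labeled, seed=0):
    """Randomly split frame indices into an initial labeled set and an unlabeled pool."""
    if not 0 <= num_labeled <= num_frames:
        raise ValueError("num_labeled must be between 0 and num_frames")
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    labeled = sorted(rng.sample(range(num_frames), num_labeled))
    labeled_set = set(labeled)
    unlabeled = [i for i in range(num_frames) if i not in labeled_set]
    return labeled, unlabeled


# Hypothetical sizes, chosen only for illustration.
labeled, unlabeled = split_initial_pool(num_frames=1000, num_labeled=100, seed=0)
```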

Train with different active learning strategies

We provide several options for active learning algorithms, including

  • random selection [random]
  • confidence sampling [confidence]
  • entropy sampling [entropy]
  • MC-Reg sampling [montecarlo]
  • greedy coreset [coreset]
  • learning loss [llal]
  • BADGE sampling [badge]
  • CRB sampling [crb]
  • KECOR sampling [kecor]
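
To give a feel for what one of the simpler baselines in this list computes, here is a hedged sketch of entropy-based selection. It is not the repository's `entropy` implementation: the function names are hypothetical, and averaging the per-box entropies into one score per point cloud is an assumption made for illustration.

```python
import numpy as np


def mean_entropy(class_probs, tiny=1e-12):
    """Mean Shannon entropy (in nats) over the predicted boxes of one point cloud.

    `class_probs` has shape (num_boxes, num_classes); each row is a probability vector.
    """
    probs = np.asarray(class_probs, dtype=float)
    if probs.size == 0:
        return 0.0  # no predicted boxes: treat the frame as having no uncertainty
    entropy = -np.sum(probs * np.log(probs + tiny), axis=1)
    return float(entropy.mean())


def select_top_k(scores, k):
    """Return the indices of the k highest scores, highest first."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    return order[:k].tolist()


# Toy predictions for three point clouds (hypothetical values).
frames = [
    [[0.98, 0.01, 0.01]],              # confident prediction
    [[1 / 3, 1 / 3, 1 / 3]],           # maximally uncertain prediction
    [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1]],
]
scores = [mean_entropy(f) for f in frames]
to_label = select_top_k(scores, k=2)
```

Frames with the highest average entropy are sent for annotation first; here the uniformly uncertain frame ranks ahead of the confident one.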

You can optionally add the command-line parameters --batch_size ${BATCH_SIZE} and --epochs ${EPOCHS} to specify your preferred values.

  • Train:
python train.py --cfg_file ${CONFIG_FILE}
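
When launching several runs from a script, the command above can be assembled programmatically. The sketch below uses only the flags named in this section; the helper name `build_train_command` and the config path are hypothetical, so substitute a real file from the active-learning config folders.

```python
import shlex


def build_train_command(cfg_file, batch_size=None, epochs=None):
    """Assemble the argument list for `python train.py` with the optional flags."""
    cmd = ["python", "train.py", "--cfg_file", str(cfg_file)]
    if batch_size is not None:
        cmd += ["--batch_size", str(batch_size)]
    if epochs is not None:
        cmd += ["--epochs", str(epochs)]
    return cmd


# Hypothetical config path, used only to show the resulting command line.
cmd = build_train_command("cfgs/active-kitti_models/example.yaml", batch_size=4, epochs=50)
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues with unusual paths.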

Acknowledgement

Part of the code for the NTK implementation is from https://github.com/dholzmueller/bmdal_reg

