[ICML 2023] End-to-End Multi-Object Detection with a Regularized Mixture Model


Jaeyoung Yoo*, Hojun Lee*, Seunghyeon Seo, Inseop Chung, Nojun Kwak
NAVER WEBTOON AI, Seoul National University
* equal contribution

PyTorch implementation of the ICML 2023 paper: End-to-End Multi-Object Detection with a Regularized Mixture Model.
The paper aims to reduce the heuristics in the training process (Figure 1) and to improve the reliability of the predicted confidence scores (Figure 2).



Requirements

The code was tested in the following environment:

  • Python 3.8
  • PyTorch 1.10
  • CUDA 11.3
  • mmdet 2.12.0
  • mmcv-full 1.3.17
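As a quick sanity check before training, the installed package versions can be compared against the list above. The sketch below is only an illustration: the helper names `matches` and `check_environment` are hypothetical and not part of this repository, and it assumes the packages are installed under the distribution names `torch`, `mmdet`, and `mmcv-full`.

```python
from importlib import metadata

# Versions listed in the Requirements section above.
TESTED = {"torch": "1.10", "mmdet": "2.12.0", "mmcv-full": "1.3.17"}


def matches(installed: str, tested: str) -> bool:
    """True if `installed` equals `tested` or extends it (1.10.1+cu113 vs 1.10)."""
    # Drop a local build tag such as "+cu113" before comparing.
    base = installed.split("+", 1)[0]
    return base == tested or base.startswith(tested + ".")


def check_environment(tested=TESTED):
    """Return {package: (installed_version_or_None, ok)} for each package."""
    report = {}
    for name, want in tested.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None
        report[name] = (have, have is not None and matches(have, want))
    return report


if __name__ == "__main__":
    for name, (have, ok) in check_environment().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name:10s} installed={have} tested={TESTED[name]} {status}")
```

A mismatch does not necessarily mean the code will fail; it only means the combination differs from the tested one.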

Data Preparation

See exist_data_model.md for dataset preparation. The expected directory layout is:

D-RMM
├── mmdet
├── tools
├── configs
├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
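The layout above can be verified with a few lines of Python before launching a job. This is a minimal sketch; the function name `missing_coco_dirs` is hypothetical and not part of the repository, and it only checks that the directories exist, not their contents.

```python
from pathlib import Path

# Sub-directories of data/coco shown in the tree above.
EXPECTED = ["annotations", "train2017", "val2017", "test2017"]


def missing_coco_dirs(repo_root="."):
    """Return the expected data/coco sub-directories that do not exist."""
    coco = Path(repo_root) / "data" / "coco"
    return [name for name in EXPECTED if not (coco / name).is_dir()]


if __name__ == "__main__":
    missing = missing_coco_dirs()
    print("COCO layout looks complete" if not missing else f"missing: {missing}")
```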

Installation

See get_started.md for the full installation guide. The commands used for the tested environment are:

pip install torch==1.10.1+cu113 torchvision==0.11.2+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
pip install mmcv-full==1.3.17 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10/index.html
python setup.py build develop

Training & Test

bash train.sh
bash test.sh

Performances

Sparse R-CNN

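| Method       | Backbone  | AP val2017  | AP test-dev | Config | Link               |
|--------------|-----------|-------------|-------------|--------|--------------------|
| SRCNN        | R50       | 45.0        | 45.2        | config | mmdet (reproduced) |
| SRCNN        | R101      | 46.4        | 46.4        | config | mmdet (reproduced) |
| SRCNN        | Swin-Tiny | 47.4        | -           | -      | -                  |
| D-RMM + SRCNN | R50      | 47.0 (+2.0) | 46.9        | config | Google Drive       |
| D-RMM + SRCNN | R101     | 48.0 (+1.6) | 48.2        | config | Google Drive       |
| D-RMM + SRCNN | Swin-Tiny | 49.9 (+2.5) | -          | TBA    | TBA                |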

AdaMixer

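| Method           | Backbone  | AP val2017  | AP test-dev | Config | Link   |
|------------------|-----------|-------------|-------------|--------|--------|
| AdaMixer         | R50       | 47.0        | 46.9        | Github | Github |
| AdaMixer         | R101      | 48.0        | 48.2        | Github | Github |
| AdaMixer         | Swin-Tiny | 48.9        | -           | -      | -      |
| D-RMM + AdaMixer | R50       | 48.4 (+1.4) | 48.7 (+1.8) | TBA    | TBA    |
| D-RMM + AdaMixer | R101      | 49.2 (+1.2) | 49.6 (+1.4) | TBA    | TBA    |
| D-RMM + AdaMixer | Swin-Tiny | 50.7 (+1.8) | -           | TBA    | TBA    |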
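The gains in parentheses are simple differences between the D-RMM row and the matching baseline row. The short sketch below recomputes the val2017 gains from the numbers in the two tables; the dictionary layout and the name `gain` are illustrative choices, not part of the repository.

```python
# AP on COCO val2017, copied from the tables above: (baseline, with D-RMM).
VAL_AP = {
    ("SRCNN", "R50"): (45.0, 47.0),
    ("SRCNN", "R101"): (46.4, 48.0),
    ("SRCNN", "Swin-Tiny"): (47.4, 49.9),
    ("AdaMixer", "R50"): (47.0, 48.4),
    ("AdaMixer", "R101"): (48.0, 49.2),
    ("AdaMixer", "Swin-Tiny"): (48.9, 50.7),
}


def gain(baseline, improved):
    """AP gain rounded to one decimal, as reported in the tables."""
    return round(improved - baseline, 1)


GAINS = {key: gain(*pair) for key, pair in VAL_AP.items()}

if __name__ == "__main__":
    for (method, backbone), delta in GAINS.items():
        print(f"D-RMM + {method} ({backbone}): +{delta}")
```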

Citation

If you find this work or code useful for your research, please cite it using the following BibTeX entry:

@misc{yoo2023endtoend,
      title={End-to-End Multi-Object Detection with a Regularized Mixture Model}, 
      author={Jaeyoung Yoo and Hojun Lee and Seunghyeon Seo and Inseop Chung and Nojun Kwak},
      year={2023},
      eprint={2205.08714},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgment

License

This project is released under the Apache 2.0 license. See LICENSE for the full license text.

D-RMM

Copyright 2022-present NAVER WEBTOON

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


d-rmm's Issues

how to compute NLL without matching?

Hi @lhj815 ,

Thank you for sharing the code of your interesting work. I enjoyed reading it.

I have a small question about the 1-to-1 mapping between the predictions and the GT boxes used to compute the NLL loss. In the code it looks to me like such a 1-to-1 mapping exists, but it is not clear to me how it is obtained.

I would appreciate any pointer/code reference. Thanks in advance!

Regards
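On the question above: in a generic mixture-model formulation, the likelihood of each ground-truth object is a weighted sum over all mixture components, so the NLL can be evaluated without assigning any prediction to any GT box. The sketch below illustrates that general idea in plain Python. The Cauchy box density, the function names, and the argument layout are assumptions made for illustration; they are not taken from this repository's code, which may compute the loss differently.

```python
import math


def cauchy_logpdf(x, loc, scale):
    """Log density of a Cauchy distribution at x."""
    z = (x - loc) / scale
    return -math.log(math.pi * scale * (1.0 + z * z))


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def mixture_nll(gt_boxes, gt_labels, weights, locs, scales, class_probs):
    """Mean negative log-likelihood of ground-truth objects under a mixture.

    weights[k]      mixing weight of component k (positive, sums to 1)
    locs[k][d]      location of box coordinate d for component k
    scales[k][d]    scale of box coordinate d for component k
    class_probs[k]  per-class probabilities of component k (all positive)
    """
    total = 0.0
    for box, label in zip(gt_boxes, gt_labels):
        per_component = []
        for k, w in enumerate(weights):
            log_p = math.log(w) + math.log(class_probs[k][label])
            log_p += sum(
                cauchy_logpdf(b, m, s)
                for b, m, s in zip(box, locs[k], scales[k])
            )
            per_component.append(log_p)
        # Every component contributes to every ground truth: no assignment step.
        total -= logsumexp(per_component)
    return total / len(gt_boxes)
```

Because the sum runs over all components, the value is unchanged if the components are reordered, which is why no matching between predictions and ground truths is needed in this formulation.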
