A Review on Anchor Assignment and Sampling Heuristics in Deep Learning-based Object Detection

This repository provides an up-to-date paper list on anchor assignment, sampling heuristics, and recent trends in object detection. It is organized according to the problem-based taxonomy of the following paper: "A Review on Anchor Assignment and Sampling Heuristics in Deep Learning-based Object Detection" - [Paper]

How to add new papers to this repository

If you find a new paper related to anchor assignment, sampling methods, or new trends in object detection, please feel free to make a pull request.

Table of Contents

  1. Anchor Assignment Methods
    1.1 Hard Anchor Assignment
    1.2 Soft Anchor Assignment
  2. Sampling Methods
    2.1 Hard Sampling
    2.2 Soft Sampling
  3. Recent Trends in Object Detection
    3.1 Transformer-based Detection Head
    3.2 Transformer-based Feature Extractor

1. Anchor Assignment Methods

1.1. Hard Anchor Assignment

  • Focal Loss for Dense Object Detection, ICCV 2017. [Paper]
  • FCOS: Fully Convolutional One-Stage Object Detection, ICCV 2019. [Paper]
  • Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection, CVPR 2020. [Paper]
  • You Only Look One-level Feature, CVPR 2021. [Paper]

1.2. Soft Anchor Assignment

  • FreeAnchor: Learning to Match Anchors for Visual Object Detection, NeurIPS 2019. [Paper]
  • Learning from Noisy Anchors for One-stage Object Detection, CVPR 2020. [Paper]
  • Multiple Anchor Learning for Visual Object Detection, CVPR 2020. [Paper]
  • AutoAssign: Differentiable Label Assignment for Dense Object Detection, arXiv 2020. [Paper]
  • Probabilistic Anchor Assignment with IoU Prediction for Object Detection, ECCV 2020. [Paper]
  • End-to-End Object Detection with Transformers, ECCV 2020. [Paper]
  • End-to-End Object Detection with Fully Convolutional Network, CVPR 2021. [Paper]
  • LLA: Loss-aware Label Assignment for Dense Pedestrian Detection, Neurocomputing 2021. [Paper]
  • OTA: Optimal Transport Assignment for Object Detection, CVPR 2021. [Paper]
  • What Makes for End-to-End Object Detection?, ICML 2021. [Paper]
  • YOLOX: Exceeding YOLO Series in 2021, arXiv 2021. [Paper]
  • TOOD: Task-aligned One-stage Object Detection, ICCV 2021. [Paper]
  • Mutual Supervision for Dense Object Detection, ICCV 2021. [Paper]
  • Improving Object Detection by Label Assignment Distillation, WACV 2022. [Paper]
  • A Dual Weighting Label Assignment Scheme for Object Detection, CVPR 2022. [Paper]
  • ObjectBox: From Centers to Boxes for Anchor-Free Object Detection, ECCV 2022. [Paper]

2. Sampling Methods

2.1. Hard Sampling

  • Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014. [Paper]
  • Fast R-CNN, ICCV 2015. [Paper]
  • Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, NeurIPS 2015. [Paper]
  • SSD: Single Shot MultiBox Detector, ECCV 2016. [Paper]
  • Training Region-based Object Detectors with Online Hard Example Mining, CVPR 2016. [Paper]
  • Libra R-CNN: Towards Balanced Learning for Object Detection, CVPR 2019. [Paper]
  • Overlap Sampler for Region-Based Object Detection, WACV 2020. [Paper]

2.2. Soft Sampling

  • Focal Loss for Dense Object Detection, ICCV 2017. [Paper]
  • Is Heuristic Sampling Necessary in Training Deep Object Detectors?, TIP 2021. [Paper]
  • Gradient Harmonized Single-stage Detector, AAAI 2019. [Paper]
  • Prime Sample Attention in Object Detection, CVPR 2020. [Paper]
  • Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection, NeurIPS 2020. [Paper]
  • Learning a Unified Sample Weighting Network for Object Detection, CVPR 2020. [Paper]
  • Equalization Loss for Long-Tailed Object Recognition, CVPR 2020. [Paper]
  • Equalization Loss v2: A New Gradient Balance Approach for Long-tailed Object Detection, CVPR 2021. [Paper]
  • VarifocalNet: An IoU-aware Dense Object Detector, CVPR 2021. [Paper]

3. Recent Trends in Object Detection

3.1. Transformer-based Detection Head

  • End-to-End Object Detection with Transformers, ECCV 2020. [Paper]
  • Deformable DETR: Deformable Transformers for End-to-End Object Detection, ICLR 2021. [Paper]
  • End-to-End Object Detection with Adaptive Clustering Transformer, BMVC 2021. [Paper]
  • Fast Convergence of DETR with Spatially Modulated Co-Attention, ICCV 2021. [Paper]
  • Conditional DETR for Fast Training Convergence, ICCV 2021. [Paper]
  • PnP-DETR: Towards Efficient Visual Analysis with Transformers, ICCV 2021. [Paper]
  • Dynamic DETR: End-to-End Object Detection with Dynamic Attention, ICCV 2021. [Paper]
  • Rethinking Transformer-based Set Prediction for Object Detection, ICCV 2021. [Paper]
  • WB-DETR: Transformer-Based Detector Without Backbone, ICCV 2021. [Paper]
  • UP-DETR: Unsupervised Pre-training for Object Detection with Transformers, CVPR 2021. [Paper]
  • Efficient DETR: Improving End-to-End Object Detector with Dense Prior, arXiv 2021. [Paper]
  • ViDT: An Efficient and Effective Fully Transformer-based Object Detector, ICLR 2022. [Paper]
  • Anchor DETR: Query Design for Transformer-Based Detector, AAAI 2022. [Paper]
  • You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection, NeurIPS 2021. [Paper]
  • Sparse DETR: Efficient End-to-End Object Detection with Learnable Sparsity, ICLR 2022. [Paper]
  • Accelerating DETR Convergence via Semantic-Aligned Matching, CVPR 2022. [Paper]
  • DN-DETR: Accelerate DETR Training by Introducing Query DeNoising, CVPR 2022. [Paper]
  • DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection, arXiv 2022. [Paper]
  • AdaMixer: A Fast-Converging Query-Based Object Detector, CVPR 2022. [Paper]
  • DETRs with Hybrid Matching, arXiv 2022. [Paper]

3.2. Transformer-based Feature Extractor

  • Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions, ICCV 2021. [Paper]
  • PVTv2: Improved Baselines with Pyramid Vision Transformer, CVMJ 2022. [Paper]
  • Swin Transformer: Hierarchical Vision Transformer using Shifted Windows, ICCV 2021. [Paper]
  • Swin Transformer V2: Scaling Up Capacity and Resolution, arXiv 2021. [Paper]
  • Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding, ICCV 2021. [Paper]
  • Focal Self-attention for Local-Global Interactions in Vision Transformers, NeurIPS 2021. [Paper]
  • CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows, CVPR 2022. [Paper]
  • MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer, ICLR 2022. [Paper]
  • Twins: Revisiting the Design of Spatial Attention in Vision Transformers, NeurIPS 2021. [Paper]
  • ResT: An Efficient Transformer for Visual Recognition, NeurIPS 2021. [Paper]
  • Transformer in Transformer, NeurIPS 2021. [Paper]
  • Quadtree Attention for Vision Transformer, ICLR 2022. [Paper]
  • MPViT: Multi-Path Vision Transformer for Dense Prediction, CVPR 2022. [Paper]

Contact

If you have any questions, please contact Xuan-Thuy Vo (email: [email protected]). To cite the paper:

@article{VO2022,
  title = {A Review on Anchor Assignment and Sampling Heuristics in Deep Learning-based Object Detection},
  journal = {Neurocomputing},
  year = {2022},
  issn = {0925-2312},
  doi = {10.1016/j.neucom.2022.07.003},
  url = {https://www.sciencedirect.com/science/article/pii/S092523122200861X},
  author = {Xuan-Thuy Vo and Kang-Hyun Jo}
}
