

PCAN for Multiple Object Tracking and Segmentation

This is the official implementation of the paper PCAN for MOTS.

We also provide a trailer consisting of method illustrations and tracking & segmentation visualizations. Our project website contains more information: vis.xyz/pub/pcan. The code is still being organized.

Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation
NeurIPS 2021, Spotlight
Lei Ke, Xia Li, Martin Danelljan, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu

Abstract

Multiple object tracking and segmentation requires detecting, tracking, and segmenting objects belonging to a set of given classes. Most approaches only exploit the temporal dimension to address the association problem, while relying on single-frame predictions for the segmentation mask itself. We propose the Prototypical Cross-Attention Network (PCAN), capable of leveraging rich spatio-temporal information for online multiple object tracking and segmentation. PCAN first distills a space-time memory into a set of prototypes and then employs cross-attention to retrieve rich information from past frames. To segment each object, PCAN adopts a prototypical appearance module to learn a set of contrastive foreground and background prototypes, which are then propagated over time. Extensive experiments demonstrate that PCAN outperforms current video instance tracking and segmentation competition winners on both the YouTube-VIS and BDD100K datasets, and that it is effective in both one-stage and two-stage segmentation frameworks.
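To make the two mechanisms above concrete, below is a minimal PyTorch sketch of (a) distilling a space-time memory into prototypes with a few soft k-means (EM) iterations, (b) cross-attending from current-frame features to those prototypes, and (c) scoring pixels against contrastive foreground/background prototypes. The function names, prototype count, and EM iteration count are illustrative assumptions for exposition, not the repository's actual API.

```python
import torch
import torch.nn.functional as F

def distill_prototypes(memory: torch.Tensor, num_protos: int = 64,
                       em_iters: int = 3) -> torch.Tensor:
    """Distill a space-time memory (N, C) into (num_protos, C) prototypes
    via soft k-means / EM -- a stand-in for PCAN's prototype distillation."""
    idx = torch.randperm(memory.shape[0])[:num_protos]
    protos = memory[idx].clone()                          # (P, C) init
    for _ in range(em_iters):
        # E-step: soft-assign each memory feature to the prototypes.
        assign = F.softmax(memory @ protos.t(), dim=1)    # (N, P)
        # M-step: prototypes become assignment-weighted means.
        protos = (assign.t() @ memory) / (assign.sum(0).unsqueeze(1) + 1e-6)
    return protos

def prototypical_cross_attention(query: torch.Tensor,
                                 protos: torch.Tensor) -> torch.Tensor:
    """Retrieve temporal context for current-frame features (HW, C) by
    attending to the prototypes (P, C) instead of the full memory."""
    attn = F.softmax(query @ protos.t() / query.shape[1] ** 0.5, dim=1)
    return attn @ protos                                  # (HW, C) context

def fg_bg_prototype_mask(query: torch.Tensor, fg_protos: torch.Tensor,
                         bg_protos: torch.Tensor) -> torch.Tensor:
    """Score each pixel against contrastive foreground/background
    prototypes; higher foreground similarity -> higher mask probability."""
    fg = (query @ fg_protos.t()).max(dim=1).values        # (HW,)
    bg = (query @ bg_protos.t()).max(dim=1).values        # (HW,)
    return torch.sigmoid(fg - bg)                         # per-pixel soft mask

if __name__ == "__main__":
    mem = torch.randn(4096, 256)    # pooled features from past frames
    cur = torch.randn(1024, 256)    # current-frame features (HW = 1024)
    protos = distill_prototypes(mem)
    ctx = prototypical_cross_attention(cur, protos)
    print(ctx.shape)                # torch.Size([1024, 256])
```

The practical point of this design is that attention cost scales with the small prototype count (here 64) rather than the full memory size, which is what allows a long temporal window to be used online.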

Prototypical Cross-Attention Networks (PCAN)

Main results

| Detector | mMOTSA (val) | mIDF1 (val) | ID Sw. (val) | Scores (val) | mMOTSA (test) | mIDF1 (test) | ID Sw. (test) | Scores (test) | Config | Weights | Preds | Visuals |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ResNet-50 | 28.1 | 45.4 | 874 | scores | 31.9 | 50.4 | 845 | scores | config | model \| MD5 | preds | visuals |

Installation

Please refer to INSTALL.md for installation instructions.

Usage

Please refer to GET_STARTED.md for dataset preparation and detailed running (training, testing, visualization, etc.) instructions.

Citation

If you find PCAN useful in your research or refer to the provided baseline results, please star ⭐ this repository and consider citing 📝:

@inproceedings{pcan,
    author    = {Ke, Lei and Li, Xia and Danelljan, Martin and Tai, Yu-Wing and Tang, Chi-Keung and Yu, Fisher},
    booktitle = {Advances in Neural Information Processing Systems},
    title     = {Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation},
    year      = {2021}
}

Contributors

lkeab, fyu
