
emashkin / imaterialist2020-image-segmentation-on-detectron2


This project forked from julienbeaulieu/imaterialist2020-image-segmentation-on-detectron2


In this competition we are tasked with instance segmentation and attribute localization (recognizing one or more attributes for each instance) on a fashion and apparel dataset. We customize Detectron2 to handle this new task.

Python 0.45% Jupyter Notebook 99.55%


iMaterialist 2020 Kaggle Competition in Detectron2

In this competition we are tasked with instance segmentation as well as attribute localization (recognizing one or more attributes for each instance) on a fashion and apparel dataset. Here is the link to the competition.

Model and Training

To solve the challenging problems entailed in this task, we extend Detectron2's Mask R-CNN architecture with a new attribute head (shown in orange in the architecture diagram).

  • The backbone is a ResNet-50 with a feature pyramid network (FPN).
  • Input images are resized so that the longer edge is 1300 pixels before being fed to the network.
  • Random horizontal flipping is applied during training.
  • The model is initialized from weights pre-trained on the COCO dataset and trained for 300,000 iterations.
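The two input transformations in the list above can be sketched in plain Python. This is only an illustration of the geometry, not the code used in this repo (which relies on Detectron2's own data pipeline); the function names `resize_to_longer_edge` and `hflip_box` are hypothetical, and the sketch assumes the aspect ratio is preserved when resizing.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


def resize_to_longer_edge(width: int, height: int, target: int = 1300) -> Tuple[int, int]:
    """Return the (width, height) after scaling so the longer edge equals `target`.

    Assumption: the aspect ratio is kept, so both edges share one scale factor.
    """
    scale = target / max(width, height)
    return round(width * scale), round(height * scale)


def hflip_box(box: Box, image_width: float) -> Box:
    """Mirror a bounding box horizontally inside an image of `image_width` pixels."""
    x0, y0, x1, y1 = box
    # The left and right edges swap roles after the flip.
    return (image_width - x1, y0, image_width - x0, y1)


if __name__ == "__main__":
    print(resize_to_longer_edge(2600, 1300))
    print(hflip_box((10, 20, 110, 220), image_width=1300))
```

When an image is flipped during training, its boxes and masks have to be flipped in the same way, which is why the box helper is shown next to the resize.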

Kaggle Submission

The submission to Kaggle required a specific encoding (run-length encoding, RLE) for all predicted masks in order to reduce the size of the submitted file. This posed a number of challenges since RLE is not standardized among COCO, Detectron2, and Kaggle. Kaggle also required that masks not overlap, i.e. that each pixel belong to at most one mask, so mask refinement was required.
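As a rough illustration of the two steps described above, here is a minimal pure-Python sketch. It assumes the usual Kaggle RLE convention (space-separated `start length` pairs, pixels numbered from 1, top to bottom and then left to right) and a simple "first mask wins" rule for overlaps. Both are assumptions made for the example, and `kaggle_rle_encode` and `remove_overlaps` are hypothetical names, not functions from this repo.

```python
from typing import List

Mask = List[List[int]]  # rows of 0/1 values


def kaggle_rle_encode(mask: Mask) -> str:
    """Encode a binary mask as 'start length start length ...'.

    Assumed convention: 1-indexed pixels, column-major order
    (top to bottom within a column, then left to right).
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    # Flatten column by column.
    flat = [mask[r][c] for c in range(width) for r in range(height)]

    runs: List[int] = []
    run_start = None
    for i, value in enumerate(flat):
        if value and run_start is None:
            run_start = i  # a new run of foreground pixels begins
        elif not value and run_start is not None:
            runs += [run_start + 1, i - run_start]  # 1-indexed start, length
            run_start = None
    if run_start is not None:  # run reaching the last pixel
        runs += [run_start + 1, len(flat) - run_start]
    return " ".join(map(str, runs))


def remove_overlaps(masks: List[Mask]) -> List[Mask]:
    """Assign each pixel to the first mask in the list that claims it.

    Assumption: masks are already sorted by priority (e.g. by score).
    """
    if not masks:
        return []
    height, width = len(masks[0]), len(masks[0][0])
    taken = [[0] * width for _ in range(height)]
    refined: List[Mask] = []
    for mask in masks:
        out = [[0] * width for _ in range(height)]
        for r in range(height):
            for c in range(width):
                if mask[r][c] and not taken[r][c]:
                    out[r][c] = 1
                    taken[r][c] = 1
        refined.append(out)
    return refined


if __name__ == "__main__":
    masks = remove_overlaps([[[1, 1], [0, 0]], [[1, 0], [1, 0]]])
    print([kaggle_rle_encode(m) for m in masks])
```

COCO-style RLE differs from this (it stores alternating run lengths rather than start positions), which is one reason a conversion step is needed between the formats.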

Evaluation

Submissions are evaluated on the mean average precision over two kinds of thresholds:

  1. IoU: intersection over union (IoU) thresholds. The IoU of a proposed set of object pixels A and a set of true object pixels B is calculated as:

     IoU(A, B) = |A ∩ B| / |A ∪ B|

  2. F1: the F1 score between the set of predicted attributes and the set of true attributes of one segmentation mask.
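The two quantities above can be sketched with Python sets. This is a minimal illustration using the standard definitions of IoU and F1, not the official Kaggle scoring code; the function names and the choice to return 0.0 for empty inputs are assumptions made here.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]  # (row, column)


def iou(pred_pixels: Set[Pixel], true_pixels: Set[Pixel]) -> float:
    """Intersection over union of two pixel sets."""
    union = pred_pixels | true_pixels
    if not union:
        return 0.0  # convention chosen here for two empty masks
    return len(pred_pixels & true_pixels) / len(union)


def f1_score(pred_attrs: Set[int], true_attrs: Set[int]) -> float:
    """F1 score between predicted and ground-truth attribute ids of one mask."""
    overlap = len(pred_attrs & true_attrs)
    if overlap == 0:
        return 0.0  # also covers the case where either set is empty
    precision = overlap / len(pred_attrs)
    recall = overlap / len(true_attrs)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(iou({(0, 0), (0, 1)}, {(0, 1), (1, 1)}))
    print(f1_score({1, 2}, {2, 3}))
```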

The metric sweeps over a range of IoU thresholds and F1 thresholds, calculating an average precision value at each point. The threshold values range from 0.5 to 0.95 with a step size of 0.05: (0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95). For example, at an IoU threshold of 0.5 and an F1 threshold of 0.5, a predicted object is considered a "hit" if it satisfies the following conditions:

  1. Its intersection over union with a ground truth object is greater than 0.5.
  2. If the ground truth object has attributes, the F1 score between the predicted attributes and the ground-truth attributes is greater than 0.5.

At each threshold pair t = (ti, tf), a precision value is calculated from the number of true positives (TP), false negatives (FN), and false positives (FP) resulting from comparing the predicted objects to all ground truth objects:

     TP(t) / (TP(t) + FP(t) + FN(t))
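The hit test and the per-threshold precision can be sketched as follows. This is an illustration only: it assumes the precision at a threshold pair is TP / (TP + FP + FN), as is common for Kaggle segmentation metrics, and it does not reproduce how Kaggle matches predictions to ground truth or averages over threshold pairs. The names `is_hit` and `precision_at_threshold` are hypothetical.

```python
from typing import List

# Thresholds from 0.5 to 0.95 in steps of 0.05, used for both IoU and F1.
THRESHOLDS: List[float] = [t / 100 for t in range(50, 100, 5)]


def is_hit(iou_value: float, f1_value: float, has_attributes: bool,
           t_iou: float, t_f1: float) -> bool:
    """Check whether a prediction counts as a hit at the pair (t_iou, t_f1)."""
    if iou_value <= t_iou:
        return False
    # The F1 condition only applies when the ground truth has attributes.
    if has_attributes and f1_value <= t_f1:
        return False
    return True


def precision_at_threshold(tp: int, fp: int, fn: int) -> float:
    """Assumed per-threshold precision: TP / (TP + FP + FN)."""
    total = tp + fp + fn
    return tp / total if total else 0.0


if __name__ == "__main__":
    print(THRESHOLDS)
    print(is_hit(0.6, 0.4, has_attributes=True, t_iou=0.5, t_f1=0.5))
    print(precision_at_threshold(tp=3, fp=1, fn=1))
```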

Category and Attributes Analysis

There are 46 apparel categories and 294 attributes presented in the Fashionpedia dataset. On average, each image was annotated with 7.3 instances, 5.4 categories, and 16.7 attributes. Of all the masks with categories and attributes, each mask has 3.7 attributes on average (max 14 attributes).
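Averages of this kind can be computed along the following lines. This is a sketch under stated assumptions: the `(image_id, category_id, attribute_ids)` record layout and the `summarize` function are hypothetical, categories and attributes are counted as distinct values per image (the source does not say whether repeats are counted), and the values in the example are toy data, not taken from the dataset.

```python
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical record layout: (image_id, category_id, attribute_ids)
Annotation = Tuple[str, int, List[int]]


def summarize(annotations: List[Annotation]) -> Dict[str, float]:
    """Compute per-image and per-mask averages for a non-empty annotation list."""
    per_image: Dict[str, List[Annotation]] = {}
    for ann in annotations:
        per_image.setdefault(ann[0], []).append(ann)
    groups = list(per_image.values())

    # Attribute counts of masks that have at least one attribute.
    attr_counts = [len(a[2]) for a in annotations if a[2]]

    return {
        "instances_per_image": mean(len(g) for g in groups),
        # Assumption: distinct categories / attributes are counted per image.
        "categories_per_image": mean(len({a[1] for a in g}) for g in groups),
        "attributes_per_image": mean(len({x for a in g for x in a[2]}) for g in groups),
        "attributes_per_mask": mean(attr_counts) if attr_counts else 0.0,
        "max_attributes_per_mask": max(attr_counts, default=0),
    }


if __name__ == "__main__":
    toy = [
        ("a", 1, [10, 11]),
        ("a", 1, []),
        ("a", 2, [10]),
        ("b", 3, [12, 13, 14]),
    ]
    print(summarize(toy))
```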

Docker

A Docker image is available at https://hub.docker.com/r/cvnnig/detectron2.

WIP

This repo is still being cleaned and organized.

Authors

Julien Beaulieu, Yang Ding
