
hand-segmentation-in-the-wild

Analysis of Hand Segmentation in the Wild

Abstract

A large number of works in egocentric vision have concentrated on action and object recognition. Detection and segmentation of hands in first-person videos, however, has been less explored. For many applications in this domain, it is necessary to accurately segment not only the hands of the camera wearer but also the hands of others with whom they are interacting. Here, we take an in-depth look at the hand segmentation problem. In the quest for robust hand segmentation methods, we evaluated the performance of state-of-the-art semantic segmentation methods, off the shelf and fine-tuned, on existing datasets. We fine-tune RefineNet, a leading semantic segmentation method, for hand segmentation and find that it does much better than the best contenders. Existing hand segmentation datasets were collected in laboratory settings. To overcome this limitation, we contribute two new datasets: a) EgoYouTubeHands, which includes egocentric videos containing hands in the wild, and b) HandOverFace, for analyzing the performance of our models in the presence of similar-appearance occlusions. We further explore whether conditional random fields can help refine the generated hand segmentations. To demonstrate the benefit of accurate hand maps, we train a CNN for hand-based activity recognition and achieve higher accuracy when the CNN is trained using hand maps produced by the fine-tuned RefineNet. Finally, we annotate a subset of the EgoHands dataset for fine-grained action recognition and show that an accuracy of 58.6% can be achieved by looking at a single hand pose, which is much better than the chance level (12.5%).

Code

We have uploaded the additional files needed to train, test, and evaluate our models. Code for multi-scale evaluation is also provided; see the refinenet_files folder.
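The evaluation code shipped in refinenet_files is the reference; as an illustration only, the standard per-image metric for binary hand masks (intersection over union) can be sketched as follows. This is a minimal, hypothetical example and not the authors' evaluation code.

```python
# Minimal IoU sketch for binary hand masks (illustration only;
# not the evaluation code provided in refinenet_files).
import numpy as np

def mask_iou(pred, gt):
    """IoU between two binary masks, where nonzero pixels mark the hand."""
    pred = np.asarray(pred).astype(bool)
    gt = np.asarray(gt).astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    # Two empty masks agree perfectly, so define IoU = 1.0 in that case.
    return inter / union if union > 0 else 1.0

# Example: prediction covers the ground-truth pixel plus one extra pixel.
pred = np.array([[1, 1], [0, 0]])
gt = np.array([[1, 0], [0, 0]])
print(mask_iou(pred, gt))  # 0.5 (intersection 1, union 2)
```

Averaging this value over all test images gives a mean IoU, the usual summary number for segmentation benchmarks.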

To test the models:

  • Download the RefineNet code from its GitHub repository.
  • Copy the files provided in the refinenet_files folder to the refinenet/main folder.
  • Place a RefineNet-based hand segmentation model (see the Models section) in the refinenet/model_trained folder. For instance, to test the model trained on the EgoHands dataset, copy refinenet_res101_egohands.mat into refinenet/model_trained.
  • Set the path to the test images folder in demo_refinenet_test_example_egohands.m and run the script.
  • The demo code is the same as the original RefineNet demo files except for minor changes.
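The file-placement steps above can be sketched with Python's standard library. The folder and file names come from this README; the block only prepares the directory layout and copies files when they are present, so adjust paths to your local checkout (downloading the RefineNet code itself is left to you).

```python
# Sketch of the setup steps above (folder names taken from this README).
# Run from the directory containing refinenet_files and the downloaded
# .mat model; the RefineNet code itself must be downloaded separately.
import os
import shutil

os.makedirs("refinenet/main", exist_ok=True)
os.makedirs("refinenet/model_trained", exist_ok=True)

# Copy the provided helper files into refinenet/main, if present.
if os.path.isdir("refinenet_files"):
    for name in os.listdir("refinenet_files"):
        src = os.path.join("refinenet_files", name)
        if os.path.isfile(src):
            shutil.copy(src, "refinenet/main")

# Place the trained model (see the Models section) where the demo expects it.
if os.path.isfile("refinenet_res101_egohands.mat"):
    shutil.copy("refinenet_res101_egohands.mat", "refinenet/model_trained")
```

After this, demo_refinenet_test_example_egohands.m can be pointed at your test images folder and run from refinenet/main.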

Models

You can download our RefineNet-based hand segmentation models using the links given below:

Datasets

We used four hand segmentation datasets in our work; two of them (the EgoYouTubeHands and HandOverFace datasets) were collected as part of our contribution:

Warning!

Thanks to Rafael Redondo Tejedor, who pointed out some minor mistakes in the dataset:

  • For the HandOverFace dataset, the images 216.jpg and 221.jpg are actually GIFs in the original-size folder.
  • There were minor annotation errors in the XML files for the images 10.jpg and 225.jpg, which were pointed out and corrected by Rafael Redondo Tejedor.
  • The current link to the dataset has updated XML files for the above-mentioned annotation errors.

NEW!

Links to the videos used for the EYTH dataset are given below. Each video is 3-6 minutes long. We cleaned the dataset before annotation and discarded unnecessary frames (e.g., frames containing text, or frames where hands were out of view for a long time).

vid4 vid6 vid9

NEW!

The test set for the HandOverFace dataset is uploaded here.

Example images from EgoYouTubeHands dataset: EYTH

Example images from HandOverFace dataset: HOF

  • EgoHands+ dataset: To study fine-grained action recognition, we provide additional annotations for a subset of the EgoHands dataset. You can find more details here and download the dataset from this download link.

Results

Qualitative Results

Hand segmentation results for all datasets:

CVPR Poster

cvpr-poster

Acknowledgements

We would like to thank undergraduate students Cristopher Matos and Jose-Valentin Sera-Josef, and MS student Shiven Goyal, for helping us with data annotation.

Citation

If this work and/or the datasets are useful for your research, please cite our paper.

Questions?

Please contact '[email protected]'
