Code Monkey home page Code Monkey logo

hand-tracking-pytorch's Introduction

Hand Tracking Pytorch

Hand tracking application developed in Pytorch.

Hand tracking application developed for an university project. It used a RCNN (Regions with CNN) to track hands inside the image and a CNN (Convolutional Neural Network) to evaluate the number of the fingers in the hand.

The RCNN training script is based on the script at tutorial in the Pytorch site. I use a pretrained RCNN on COCO dataset and with fine tuning I teach the network how to recognize hands. The original tutorial used a Mask-RCNN that also create a mask of the shape of the things you search. Since I was only interested in a square box aruond hand I modify the script to use a simpler RCNN.

To test the track in real time you also need OpenCV.

A simple video demo is available at https://www.youtube.com/watch?v=1nmR-dRdeos.

RCNN Training

I try two different dataset to train the network. The firs is the pretty famous EgoHands Dataset (The 1.3 Gb version) and the other is a my personal (costum) dataset that I create. The results are similar and you could choose which one to use. The trained model provided in the link is trained on my personal dataset.

Costum dataset use and creation (RCNN (my dataset) folder)

To create your own dataset used the file inside the RCNN (my dataset) folder. The files are well commented but basically you need to:

  1. Execute the dataset_creator.py (you need OpenCV for this). This script create a windows with the image capture by your camera and n boxes (n set by you). You position the hand inside the boxes and after x seconds (x decide by you) the image and the boxes position are saved.
  2. Check the dataset with dataset_check.py (OPTIONAL).
  3. Train the network with train_RCNN.py .

The MyDataset.py file contain the Dataset class used during training. This class is an extension of the Pytorch Dataset class and it will used in combination with the Dataloader (more info and example here, here, here and here).

The support_function.py contain function used during the dataset creation and the dataset loading.

NB: The train need torchvision. All the file are inside the vision folder and that folder must be in the same folder of train_RCNN.py.

EgoHands dataset use (RCNN (egos hand dataset) folder)

To train the model with the EgoHands dataset download the dataset from the link (The 1.3 Gb version). Extract it and put the folders inside the _LABELLED_SAMPLES in tha path that you will use to contain the file for the training. After that execute the file train_RCNN.py .

NB: The train need torchvision. All the file are inside the vision folder and that folder must be in the same folder of train_RCNN.py.

CNN Training (CNN (Finger Counter) folder)

The CNN is a simple network trained on a costum dataset. The dataset is created with the file dataset_creation.py in a similar way to the costum dataset for the RCNN. I use an Adadelta optimizer with a Cross Entropy loss function. For some reason sometimes the training succed and sometimes remain stuck and don't learn after hundreds of epoch. In the last case stop and restart the training.

I use 140 x 140 square as size of the image but you could choose the size that you prefer.

The trained network provided with the code doesn't work very well since I train its on 200 example. Also I suggest you to use always the same hand to improve the accurancy. A trick later used in the the final script is to mirror one of the two hand find by RCNN so the classifier will only receive one type of hand and give similar results for both hand.

Hand tracking script

The hand_detector_V2.py contain the script that track the hand(s). It use OpenCV to capture the camera image and feed that image to the RCNN. With the RCNN I find the hand, take only that portion of the image, reshape in a 140 x 140 square and feed them into the fingers counter.

NB: You need a GPU with CUDA to obtain a (more or less) smoot frame rate. I use a GTX 2070 but probably even less powerfull GPU work. To improve the FPS I use a little trick: instead of feed every frame into the RCNN I do it every 3 frame. In this way the code works smoothly granted that you had a GPU capable of executing the code required by the RCNN in relative small time.

TODO List

  • Add video explanation.
  • Modify the costum dataset to label differently left hand and right and.
  • Retrain the RCNN to allow recognition of left and right hand.

Contacs

If you ever use this code please cite me ๐Ÿ™ โค๏ธ

If you need to contact my you could write me at [email protected] or [email protected] .

hand-tracking-pytorch's People

Contributors

jesus-333 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

hand-tracking-pytorch's Issues

What's the pytorch version for your pre-trained model?

What's the pytorch version for your pre-trained model? My torch version is 1.7.1 but something abour pytorch goes wrong when I ran the script hand_detector_V2.py. Could you please add some information about your environment? Thanks!

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.