
A new and computationally cheap method for human activity recognition using PoseNet and an LSTM: PoseNet handles preprocessing, and the LSTM learns the sequence.

License: MIT License


Lite Model For Human Activity Recognition (HAR)


Content

  1. Example
  2. Overview
  3. Architecture
  4. Usage

Example

Let's look at two examples:

  1. Recognising a person playing a guitar.
  2. Recognising wrestling.

Overview

Most HAR models are too heavy to deploy on low-power hardware such as the Raspberry Pi or Jetson Nano. Even on laptops, inference time is high and causes noticeable lag. This model efficiently solves that problem:

A Binary HAR classifier that can be trained and deployed in less than 10 lines of code.

Architecture

As this is a time-series problem, an LSTM was the apparent choice.

The LSTM has to learn the relative motion of body joints for a given action.

Preprocessing

In preprocessing, the PoseNet TF-Lite model is run with tf.lite.Interpreter.

  1. PoseNet returns a heatmap and an offset map.
  2. From these we extract the locations of the 17 keypoints/body joints that PoseNet detects.
  3. If a joint is not in frame, it is assigned [0, 0].
  4. From every video we sample 9 equally spaced frames.
  5. Each of the 9 frames in each video then contains 17 [x, y] pairs.
  6. That is 34 values per frame.
  7. Given the path to the dataset, we generate a .csv file, which is essentially our training data.
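As a sketch, the frame sampling and keypoint flattening described above might look like this (the helper names `sample_frame_indices` and `flatten_keypoints` are illustrative, not taken from the repo):

```python
import numpy as np

def sample_frame_indices(total_frames, n_samples=9):
    # Pick n_samples equally spaced frame indices across the video.
    return np.linspace(0, total_frames - 1, n_samples).astype(int)

def flatten_keypoints(keypoints):
    # keypoints: 17 [x, y] pairs from PoseNet (joints not in frame
    # are represented as [0, 0], as described above).
    return np.asarray(keypoints, dtype=np.float32).reshape(-1)  # 34 values

# A 300-frame clip sampled at 9 points:
idx = sample_frame_indices(300)
kps = [[0.5, 0.5]] * 17          # dummy keypoints for illustration
row = flatten_keypoints(kps)
print(idx.shape, row.shape)      # (9,) (34,)
```

Stacking the 9 flattened frames of one video yields the (9, 34) sample fed to the LSTM.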

Model

  1. There are 'x' videos, each with 9 frames, which in turn have 34 points each.
  2. So the input shape for the first LSTM is (9, 34).
  3. The model is a 3-layer LSTM with 128 nodes each, with dropouts of 0.2 and 0.1 and a batch normalization.
  4. This is followed by a dense layer of 32 nodes and an output layer of 1 node.
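A minimal Keras sketch of such a network (the exact placement of the dropouts and batch normalization is an assumption, not copied from `train.py`):

```python
import tensorflow as tf

def build_lstm_model(timesteps=9, features=34):
    # 3 LSTM layers of 128 units, dropouts of 0.2 and 0.1, batch
    # normalization, a 32-node dense layer, and a 1-node sigmoid
    # output for binary classification, as described above.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(timesteps, features)),
        tf.keras.layers.LSTM(128, return_sequences=True),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.LSTM(128, return_sequences=True),
        tf.keras.layers.Dropout(0.1),
        tf.keras.layers.LSTM(128),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Dense(32, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid'),
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model
```

With the sigmoid output, binary cross-entropy is the natural loss for the two-label setup.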

Inference

  1. During inference, PoseNet is again used for preprocessing, and pandas takes care of the rest.
  2. Load the model and pass the preprocessed data to it.
  3. The model makes a binary classification between the two labels it was trained on.
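Since the model emits a single sigmoid probability per clip, mapping it to one of the two labels is a simple threshold. A sketch (the label names and their ordering are illustrative; the actual order depends on how the training data was assembled):

```python
def predict_label(prob, labels=('action_1', 'action_2'), threshold=0.5):
    # prob: the model's sigmoid output for one clip.
    # Probabilities at or above the threshold map to the second
    # label, below it to the first.
    return labels[int(prob >= threshold)]

print(predict_label(0.92))  # action_2
print(predict_label(0.10))  # action_1
```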

Usage

Inferring the example models

  1. Use the script testing_model.py.
  2. There are 2 models in the repo: mwrestling_vs_guitar.model and guitar_vs_yoga.model.
  3. In the line below, insert the name of the model you want to infer:
>> model = tf.keras.models.load_model('<model_name>')
  4. Once you have the video offline (or a stack of 9 frames during live inference), pass it to the function preprocess_video. Please have it ready in mp4 format:
>> X_test = preprocess_video('< Path to Video >')
  5. Once these 2 lines of the code are edited, you can run the code and obtain the prediction.

Training a custom classifier

  1. Use the script train.py.
  • NOTE: Ensure all training examples are 10 seconds long, e.g. use the Kinetics dataset.
  2. Use the function generate_training_data to make our training data. It takes 2 parameters: the path to the videos and the name of the csv you want to generate:
>> generate_training_data('<Path_to_folder_containing_videos>', '<name_of_csv>')
  3. Once both csv's are generated, use the preprocessing_csv function to get your training array. It takes 2 parameters: the path to the csv and the number of samples in the validation set. It returns the training and test split:
>> X_action2_train, X_action2_test = preprocessing_csv('<name_of_csv>', no_of_elements_in_validation_set)
  4. Now use get_final_data_for_model:
>> X_train, X_test, Y_train, Y_test = get_final_data_for_model(X_action2_train, X_action2_test, X_action1_train, X_action1_test)
  5. Use the shuffler function to shuffle the data:
>> X_train, Y_train = shuffler(X_train, Y_train)
  6. All the work is done. Time to train the model:
>> train_LSTM(X_train, Y_train, X_test, Y_test, name_of_the_model)

Here we pass in the X's and Y's along with the NAME under which you want the model to be saved.

  7. See the previous section for how to infer with it.

LICENSE

MIT

