synthetic_people

Enriching the diversity of synthetic images to improve camera-based patient monitoring.

Overview

This repository contains code for generating synthetic people in diverse, unseen poses. The resulting dataset is used as training data for developing better object detection networks.

Abstract

Camera-based patient monitoring is undergoing rapid adoption in the healthcare sector, with the recent COVID-19 pandemic acting as a catalyst. It offers round-the-clock monitoring of patients in clinical units (e.g. ICUs, ORs) or at their homes through installed cameras. These systems are powered by computer vision algorithms that pick up critical physiological data, patient activity, sleep patterns, etc., enabling real-time, pre-emptive care. In this work, we develop a person detector to deploy in such scenarios. These algorithms require huge quantities of training data, which is often in short supply in the healthcare field due to stringent privacy norms. It therefore becomes necessary to look for ways to enrich clinical data. An alternative currently popular in the computer vision community is to train on synthetic data created with 3D modeling software pipelines. However, this type of technique often has limitations in data diversity and data balancing, as the desired variations need to be provided explicitly. In this thesis, we propose a data augmentation method for enriching diversity in synthetic data without using any additional external data or software. In particular, we introduce a pose augmentation technique that synthesizes new human characters in poses unseen in the original dataset using Pose-Warp GAN. Additionally, a new metric is proposed to assess diversity in human pose datasets. The proposed augmentation method is evaluated using YOLOv3. We show that our pose augmentation technique significantly improves person detection performance compared to traditional data augmentation, especially in low-data regimes.

The following picture gives the overall objective.

[Image: objective]

Methodology

To reproduce the results of this work, follow the steps below. The image below is provided for reference.

[Image: meth_full]

Step 1: Preparing data

For this work, we used the SURREAL dataset as the dataset to augment. This implementation is not limited to just synthetic characters. Feel free to use any other "people" dataset that suits your application.

  • Run extract_frames.py to extract image frames from the video files in the SURREAL dataset. The script saves RGB images, ground-truth (GT) bounding boxes and pose keypoint files. Set the appropriate paths and the number of frames per video.
  • Run gaussian.py inside the gaussian directory. Choose a suitable number of images, paths and other variables in params.py. gaussian.py converts bounding boxes from the SURREAL format to the format accepted by Pose-Warp GAN. It also converts the bounding boxes to the YOLOv3 format.
  • Download images to be used as backgrounds from Google Open Images. The download section of the site provides instructions. Alternatively, use this repository for easier download.
  • Download 2000 images and the corresponding annotations from the MPII human pose dataset.
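
As a rough illustration of the bounding-box conversion mentioned above, the sketch below turns a pixel-space corner box into a Darknet-style YOLO label line (class, normalized center, normalized size). It is not taken from gaussian.py: the function name `corners_to_yolo`, the corner-based input layout and the single class id 0 are assumptions made for this example, and the label format expected by the YOLOv3 code used here may differ.

```python
def corners_to_yolo(x_min, y_min, x_max, y_max, img_w, img_h, class_id=0):
    """Convert a pixel-space corner box to a YOLO-style label line.

    Assumed output layout: "class x_center y_center width height",
    with all four box values normalized by the image size.
    """
    x_c = (x_min + x_max) / 2.0 / img_w
    y_c = (y_min + y_max) / 2.0 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"


# Hypothetical box centered in a 320x240 frame.
print(corners_to_yolo(80, 60, 240, 180, img_w=320, img_h=240))
# prints: 0 0.500000 0.500000 0.500000 0.500000
```

Normalizing by the image size keeps the labels valid if the images are later resized for training.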

Step 2: Augment poses

Run the cells in the aug_plots.ipynb notebook to generate new poses from existing poses. The picture below shows some examples of poses generated using this method.

[Image: good_poses]

The picture below shows the existing poses in the source dataset and the newly generated poses overlaid in a 2D PCA projection.

[Image: pca]
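
A 2D PCA projection of this kind can be sketched with NumPy as below. This is only an illustration and not the notebook's code: the pose array shape (N poses, 16 joints, 2 coordinates), the random placeholder data, and the choice to fit the PCA on the source poses only and then project both sets onto the same axes are all assumptions.

```python
import numpy as np


def fit_pca_2d(poses):
    """Fit a 2-component PCA on flattened poses; return (mean, components)."""
    flat = poses.reshape(len(poses), -1)
    mean = flat.mean(axis=0)
    # Rows of vt are the principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    return mean, vt[:2]


def project(poses, mean, components):
    """Project flattened poses onto the fitted 2D principal axes."""
    flat = poses.reshape(len(poses), -1)
    return (flat - mean) @ components.T


rng = np.random.default_rng(0)
# Placeholder data standing in for real keypoints: 50 source poses, 10 new ones.
source_poses = rng.normal(size=(50, 16, 2))
new_poses = source_poses[:10] + rng.normal(scale=0.5, size=(10, 16, 2))

mean, comps = fit_pca_2d(source_poses)
src_2d = project(source_poses, mean, comps)
new_2d = project(new_poses, mean, comps)
print(src_2d.shape, new_2d.shape)
```

Plotting `src_2d` and `new_2d` in the same scatter plot then shows whether the generated poses fall outside the region covered by the source poses.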

Step 3: Generate people

  • Run main.py inside the pose_warp directory to generate people depicting the poses generated in the previous step. The script takes as input (1) RGB images of people with background and (2) the required pose. The output is a new character depicting the conditioned pose, along with the corresponding mask file. Edit params.py to change paths and other parameters. The code is based on the Pose-Warp GAN repository. The picture below shows some example outputs.

[Image: good_ones]

  • Run the cells in gopen.ipynb to paste the newly generated people onto images from Google Open Images as backgrounds. The cell under the comment #SINGLE TRAIN produces images with one person per image, and the cell under the comment #MULTI TRAIN produces images with multiple people per image. The notebook also generates bounding box annotations in YOLOv3 format. Select the required number of images to be generated. Augmented images produced in this manner serve as the training data for the next step. Some example images are shown below:

[Image: paste]
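
The pasting step can be pictured as mask-based compositing: pixels where the mask is set are copied onto the background, and the bounding box is read off the mask extent. The sketch below is a minimal NumPy version under stated assumptions, not the code in gopen.ipynb: the function name `paste_person`, the binary mask convention, the normalized (x_center, y_center, width, height) box layout and the requirement that the crop fits entirely inside the background are all assumptions.

```python
import numpy as np


def paste_person(background, person, mask, top, left):
    """Composite `person` onto a copy of `background` where `mask` is nonzero.

    Assumes the person crop fits inside the background at (top, left).
    Returns the new image and a normalized (x_c, y_c, w, h) box.
    """
    out = background.copy()
    h, w = mask.shape
    region = out[top:top + h, left:left + w]  # view into `out`
    fg = mask.astype(bool)
    region[fg] = person[fg]

    # Tight bounding box around the pasted foreground pixels.
    ys, xs = np.nonzero(fg)
    x_min, x_max = left + xs.min(), left + xs.max() + 1
    y_min, y_max = top + ys.min(), top + ys.max() + 1
    bg_h, bg_w = background.shape[:2]
    box = ((x_min + x_max) / 2 / bg_w, (y_min + y_max) / 2 / bg_h,
           (x_max - x_min) / bg_w, (y_max - y_min) / bg_h)
    return out, box


# Toy data: black 200x100 background, white 20x40 person crop.
background = np.zeros((100, 200, 3), dtype=np.uint8)
person = np.full((40, 20, 3), 255, dtype=np.uint8)
mask = np.zeros((40, 20), dtype=np.uint8)
mask[10:30, 5:15] = 1

image, box = paste_person(background, person, mask, top=20, left=50)
print([round(float(v), 3) for v in box])
# prints: [0.3, 0.4, 0.05, 0.2]
```

Calling the function repeatedly on the returned image with different crops and offsets would give a multi-person image, with one box collected per call.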

Step 4: Training and Validation

All training and testing scripts are based on this YOLOv3 repository. To pre-train the YOLO network with synthetic augmented data, run the script run_train.sh. Edit cfg.py to adjust paths and other parameters. For training with real data from the MPII dataset, replace finetune.py with train.py in run_train.sh. Test your trained YOLOv3 model by running the script run_yolov3.sh. Select the paths and datasets to test on by editing cfg.py. To obtain object detection metrics (Precision, Recall, F1 score, mAP) on the test set, run evaluation.py. To plot Precision vs. Recall curves for different models, run scores_all.py followed by create_graphs.py. Some object detection results on the MPII human pose dataset are shown below:

[Image: detections]

The picture below shows how augmenting the training data with multi-annotation images of synthetic people affects person detection performance.

[Image: multi]
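
For readers unfamiliar with the detection metrics named above, the sketch below shows one common way to compute Precision, Recall and F1 for a single image by matching predicted boxes to ground-truth boxes with an IoU threshold. It is not the code in evaluation.py: the greedy matching, the 0.5 threshold, the corner box format and the assumption that predictions arrive sorted by descending confidence are choices made for this example. mAP, which additionally sweeps over confidence thresholds, is not covered.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(predictions, ground_truth, iou_threshold=0.5):
    """Greedy one-to-one matching of predictions to ground-truth boxes."""
    unmatched = list(ground_truth)
    tp = 0
    for pred in predictions:  # assumed sorted by descending confidence
        best = max(unmatched, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= iou_threshold:
            unmatched.remove(best)  # each ground-truth box matches at most once
            tp += 1
    fp = len(predictions) - tp
    fn = len(unmatched)
    precision = tp / (tp + fp) if predictions else 0.0
    recall = tp / (tp + fn) if ground_truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical boxes: one good detection, one false positive, one missed person.
gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
pred_boxes = [(1, 1, 10, 10), (50, 50, 60, 60)]
print(precision_recall_f1(pred_boxes, gt_boxes))
# prints: (0.5, 0.5, 0.5)
```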

Acknowledgements

This work has adapted code from:

Contributors

prchinmay
