bounding-box-prediction-RL

Bounding box prediction for object tracking using reinforcement learning. The model uses reinforcement learning to choose the best region for a given frame of the video. A Deep Q-Network (DQN) learns a state-action value function over many transitions collected with an epsilon-greedy policy. Learning the optimal value function indirectly yields an optimal policy, which lets the model select, for each current state, the region that maximizes the value function.
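The two ingredients named above, epsilon-greedy action selection and the one-step Q-learning target, can be sketched in a few lines of NumPy. This is a minimal illustration and not the repository's code: the function names, the discount factor `gamma`, and the example Q-values are assumptions made for the sketch.

```python
import numpy as np

def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random_sample() < epsilon:
        return int(rng.randint(len(q_values)))
    return int(np.argmax(q_values))

def q_target(reward, next_q_values, gamma, terminal):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if terminal:
        # No future value after a terminal transition.
        return float(reward)
    return float(reward + gamma * np.max(next_q_values))

# Hypothetical Q-values for three actions in some state.
rng = np.random.RandomState(0)
q = np.array([0.1, 0.7, 0.2])

# epsilon = 0 always exploits, so the action with the largest Q-value is chosen.
greedy_action = epsilon_greedy(q, epsilon=0.0, rng=rng)

# Regression target for the action taken in the previous state.
target = q_target(reward=1.0, next_q_values=q, gamma=0.9, terminal=False)
```

During training, epsilon would typically start high and decay so that the agent explores early and exploits later; the schedule actually used here is not stated in this README.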

The model is designed to learn from video frames and bounding box annotations of a top-down video captured at the entrance of a parking structure. It can be extended to other videos from similar applications, but the input data needs to be prepared in the expected structure.

The repository contains code to train and test the reinforcement learning model in Keras with the TensorFlow backend.

Requirements

The code in this repository requires Keras 2.0.5 with TensorFlow 1.0 and Python 2.7, along with the following Python libraries:

  • numpy
  • pandas
  • matplotlib
  • cv2

The Python modules can be installed using: 'pip install numpy pandas matplotlib opencv-python' (the opencv-python package provides the cv2 module).

Getting Started

First, copy the project and download the necessary weights to your local machine.

Prerequisites

  1. Clone the repository: git clone https://github.com/DonovanLo/bounding-box-prediction-RL.git
  2. Download the VGG16 weights: https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5

Place the VGG16 weights in the same directory as the Python scripts. The VGG16 model is used to create the image descriptor for the DQN.

Running the tests

To run the test, perform the following.

Run: python testing.py -w=./model_qn_weights/model_epoch_19_e010_h5 -t=./test_images/ -gt=./test_images/PS12_1_7_gt.txt, or just python testing.py, to create the output visualization.

The test script loops over the cars found in the annotation. For each car, it feeds the initial bounding box to the model. The model then iterates through a fixed number of steps/attempts, applying the optimal action at each step to reach the target in the image.
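The per-car loop described above can be pictured roughly as follows. This is only a sketch under stated assumptions: the action names ("left", "right", "up", "down", "stop"), the 20% step size, and the `choose_action` callback (standing in for an argmax over the DQN's Q-values) are all hypothetical and may differ from what testing.py does.

```python
def apply_action(box, action, step=0.2):
    """Return a new (xmin, ymin, xmax, ymax) box after one hypothetical action."""
    xmin, ymin, xmax, ymax = box
    dx = step * (xmax - xmin)  # horizontal shift, proportional to box width
    dy = step * (ymax - ymin)  # vertical shift, proportional to box height
    if action == "left":
        return (xmin - dx, ymin, xmax - dx, ymax)
    if action == "right":
        return (xmin + dx, ymin, xmax + dx, ymax)
    if action == "up":
        return (xmin, ymin - dy, xmax, ymax - dy)
    if action == "down":
        return (xmin, ymin + dy, xmax, ymax + dy)
    return box  # "stop" (or unknown action): leave the box unchanged

def refine_box(initial_box, choose_action, max_steps=10):
    """Apply the chosen action for at most max_steps steps, recording each one."""
    box = initial_box
    history = []
    for _ in range(max_steps):
        action = choose_action(box)
        history.append(action)
        if action == "stop":
            break
        box = apply_action(box, action)
    return box, history

# Toy policy in place of the trained network: move right until xmin reaches 4.
def toy_policy(box):
    return "right" if box[0] < 4 else "stop"

final_box, actions = refine_box((0, 0, 10, 10), toy_policy)
```

In the real script the policy would be driven by the trained Q-network and the image descriptor of the current region; the toy policy only makes the control flow visible.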

Each car will have a separate output. The output shows the number of actions and the result from taking the action at a particular state.

Running the training

To run the training, perform the following.

Run: python training.py -t=../PS12_1_7_frames/ -gt=../PS12_1_7_gt.txt to train the model.

The training will need two items:

  1. Extracted frames from the video
  2. A ground-truth text file containing bounding boxes around the targets, annotated with the VATIC tool.
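For orientation, a ground-truth file of this kind could be read as sketched below. The sketch assumes the file follows VATIC's usual text dump layout, one box per line with the fields track id, xmin, ymin, xmax, ymax, frame, lost, occluded, generated, and a quoted label; whether PS12_1_7_gt.txt matches this layout exactly is an assumption, and the helper names and sample lines are made up for illustration.

```python
import shlex
from collections import namedtuple

# Field order assumed from VATIC's standard text dump.
Box = namedtuple(
    "Box", "track_id xmin ymin xmax ymax frame lost occluded generated label"
)

def parse_vatic_line(line):
    """Parse one annotation line; shlex handles the quoted label."""
    fields = shlex.split(line)
    numbers = [int(value) for value in fields[:9]]
    return Box(*(numbers + [fields[9]]))

def load_boxes(lines):
    """Group visible boxes by frame index, skipping boxes flagged as lost."""
    by_frame = {}
    for line in lines:
        if not line.strip():
            continue
        box = parse_vatic_line(line)
        if box.lost:
            continue  # target is outside the view in this frame
        by_frame.setdefault(box.frame, []).append(box)
    return by_frame

# Made-up sample lines; real data would come from open(path).readlines().
sample = [
    '0 10 20 110 220 0 0 0 0 "car"',
    '0 12 22 112 222 1 1 0 1 "car"',
]
boxes = load_boxes(sample)
```

Grouping by frame makes it straightforward to pair each extracted frame image with the boxes annotated on it.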

With an NVIDIA GTX 1080 Ti, training for 19 epochs took 2 days.

Authors

Acknowledgments

Contributors

  • donovanlo
  • shark8dude
