
Learning Object Permanence from Video

Code for our paper: *Shamsian, *Kleinfeld, Globerson & Chechik, "Learning Object Permanence from Video"

Link to our paper: https://arxiv.org/abs/2003.10469

Installation

Code and Data

  1. Download or clone the code in this repository
  2. cd to the project directory
  3. Please visit our project website to download our datasets and pretrained models

Conda Environment

Quick installation using Conda:

conda env create -f environment.yml
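Then activate the environment (substitute the name defined in the name field of environment.yml):

conda activate <environment_name_defined_in_environment_yml>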

Run using Docker

  1. Download and unzip the datasets and the trained models. Save the datasets in a folder named data.
  2. Run the container using the following command (the project image will be downloaded from Docker Hub the first time the command is executed):

     docker run -it --rm \
         -v <absolute_path_to_the_project_directory>/trained_models:/ObjectPermanence/trained_models \
         -v <absolute_path_to_the_project_directory>/data:/ObjectPermanence/data \
         --name op ofri64/object_permanence:latest

  3. If you want to utilize available GPU resources within the container, use the nvidia-docker command instead of docker when running the container in step 2, as shown below.
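For example, a GPU-enabled run uses the same arguments with nvidia-docker as the launcher:

$ nvidia-docker run -it --rm \
    -v <absolute_path_to_the_project_directory>/trained_models:/ObjectPermanence/trained_models \
    -v <absolute_path_to_the_project_directory>/data:/ObjectPermanence/data \
    --name op ofri64/object_permanence:latest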

Directory Structure

Directory | File | Description
--- | --- | ---
./ | main.py | Main entry point for running experiments and analyzing their results
baselines/ | * | Implementation of OPNet and other baseline variants
baselines/DaSiamRPN/ | * | Code cloned from the DaSiamRPN model repository
baselines/ | learned_models.py | Implementation of OPNet and 4 learned baselines
baselines/ | programmed_models.py | Implementation of the programmed baselines
generate/ | * | Code used to generate new dataset samples and annotations; built on top of the CATER code base
object_detection/ | * | Code used for fine-tuning an object detection model; based on a PyTorch object detection tutorial
configs/ | * | Model, training, and inference configuration files

Execute OP Experiments

The main.py file supports four operation modes: training, inference, results analysis, and data preprocessing. Run python main.py --help to view the available operation modes; running python main.py <operation_mode> --help provides details about the required arguments for the chosen operation mode.

Training

For training one of the available models, run the command python main.py training with the following arguments:

--model_type <name of the model to use>
--model_config <path to a json file containing model configuration>
--training_config <path to a json file containing training configuration>

The currently supported model types are:

  1. opnet
  2. opnet_lstm_mlp
  3. baseline_lstm
  4. non_linear_lstm
  5. transformer_lstm

Model configuration files and an example training configuration file are provided in the configs directory. To run an experiment in the "learning from only visible frames" setup (Section 7.2 in the paper), append the suffix "no_labels" to the model name, for example opnet_no_labels.
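A training run of OPNet might then look like this (the config file names are illustrative assumptions; use the actual files provided in the configs directory):

$ python main.py training \
    --model_type opnet \
    --model_config configs/opnet_config.json \
    --training_config configs/training_config.json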

Inference

To use a trained model to perform inference on unlabeled data, run the command python main.py inference with the following arguments:

--model_type <name of the model to use>
--results_dir <path to a directory to save result predictions>
--model_config <path to a json file containing model configuration>
--inference_config <path to a json file containing inference configuration>

An example inference config file is provided in the configs directory.
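An inference run might look like this (the directory and config file names are illustrative assumptions):

$ python main.py inference \
    --model_type opnet \
    --results_dir test_results/opnet/ \
    --model_config configs/opnet_config.json \
    --inference_config configs/inference_config.json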

Results Analysis

To analyze the results produced by the inference command, run python main.py analysis with the following required arguments:

--predictions_dir <path to a directory containing inference results>
--labels_dir <path to a directory containing label annotations>
--output_file <csv file name for the analyzed results output>

Various frame-level annotation files can also be supplied as arguments when running python main.py analysis. The following bash command analyzes results using detailed frame-level annotations:

$ python main.py analysis \
    --predictions_dir test_results/opnet/ \
    --labels_dir test_data/labels \
    --containment_annotations test_data/containment_and_occlusions/containment_annotations.txt \
    --containment_only_static_annotations test_data/containment_and_occlusions/containment_only_static_annotations.txt \
    --containment_with_movements_annotations test_data/containment_and_occlusions/containment_with_move_annotations.txt \
    --visibility_ratio_gt_0 test_data/containment_and_occlusions/visibility_rate_gt_0.txt \
    --visibility_ratio_gt_30 test_data/containment_and_occlusions/visibility_rate_gt_30.txt \
    --iou_thresholds 0.5,0.9 \
    --output_file results.csv

Data Preprocessing

OPNet and the other baseline models receive as input the localization annotations (bounding boxes) of all the visible objects in each video frame. This input is the result of running an object detection model on the raw video frames. To run object detection and generate visible object annotations, run python main.py preprocess with the following arguments:

--results_dir <path to a directory to save result annotations>
--config <path to a json configuration file>

An example preprocess config file is provided in the configs directory.
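A preprocessing run might look like this (the directory and config file names are illustrative assumptions):

$ python main.py preprocess \
    --results_dir data/visible_object_annotations/ \
    --config configs/preprocess_config.json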

Inference according to the CATER setup (snitch localization task)

To use a trained model to perform inference on the snitch localization task defined in CATER, run the command python main.py cater_inference with the following arguments:

--results_dir <path to a directory to save result predictions>
--model_config <path to a json file containing model configuration>
--inference_config <path to a json file containing inference configuration>

The command will output a CSV file containing the 6x6 grid class predictions for each video in the provided dataset.
The inference config file should have the same structure as the config file used in the "inference" mode. This mode only supports the "opnet" model, so there is no need to specify the model type (unlike in the original "inference" mode), and the "model_config" parameter should point to the opnet model config file.
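A CATER-setup inference run might look like this (the directory and config file names are illustrative assumptions):

$ python main.py cater_inference \
    --results_dir cater_results/opnet/ \
    --model_config configs/opnet_config.json \
    --inference_config configs/inference_config.json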

Cite our paper

If you use this code, please cite our paper.

@article{shamsian2020learning,
  title={Learning Object Permanence from Video},
  author={Shamsian, Aviv and Kleinfeld, Ofri and Globerson, Amir and Chechik, Gal},
  journal={arXiv preprint arXiv:2003.10469},
  year={2020}
}
