
COSMO

COSMO is a variant of Boltzmann Machines used for contextualized scene modeling. It extends Boltzmann Machines by adding relation and affordance nodes between visible nodes. See the project web page and the paper for details.

(a) An overview of the proposed model COSMO, where the tri-way edges are shown in red, and (b) some examples of what it can provide to a robot

Introduction

Scene models allow robots to understand what is in a scene, and what else should or should not be in it according to the scene's context. COSMO models the environment in terms of the presence of objects, the relations among them, and their affordances.

Objects, relations, and affordances correspond to the visible units of a Boltzmann Machine, and context is represented by the hidden units. Each tri-way edge links two object nodes with one relation or affordance node. Every visible node is connected to every hidden node, and there are no connections among the hidden nodes.
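
As a purely illustrative sketch of this connectivity, the energy of a configuration could be written as a tri-way term plus a standard visible-hidden term. All names, shapes, and the absence of bias terms below are assumptions for exposition, not details taken from COSMO.py:

```python
import numpy as np

rng = np.random.default_rng(0)

n_obj, n_rel, n_hid = 4, 2, 3                  # object, relation/affordance, hidden unit counts
v_obj = rng.integers(0, 2, n_obj)              # binary object units (visible)
v_rel = rng.integers(0, 2, n_rel)              # binary relation/affordance units (visible)
h = rng.integers(0, 2, n_hid)                  # binary hidden (context) units

W3 = rng.normal(size=(n_obj, n_obj, n_rel))    # tri-way weights linking objects i, j and relation k
Wvh = rng.normal(size=(n_obj + n_rel, n_hid))  # all visible units connect to all hidden units

def energy(v_obj, v_rel, h):
    """Energy of one configuration: tri-way edge term plus visible-hidden term."""
    v = np.concatenate([v_obj, v_rel])
    tri = -np.einsum('i,j,k,ijk->', v_obj, v_obj, v_rel, W3)  # sum over all tri-way edges
    vis_hid = -v @ Wvh @ h                                    # full visible-to-hidden connectivity
    return tri + vis_hid
```

Note that there are deliberately no hidden-to-hidden terms, matching the restriction described above.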

COSMO supports the following scene-modeling tasks:

  • Estimating relations between objects,
  • Finding missing objects in the scene,
  • Finding irrelevant objects in the scene,
  • Predicting affordances,
  • Finding objects that afford a specific action,
  • Finding subjects that can perform an action with a specific object,
  • Random scene generation.
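
For instance, relation estimation can be read as a conditional query on such a model. The following is a minimal sketch, assuming standard Boltzmann Machine conditionals (the probability of a unit being on is a sigmoid of its total input); `relation_prob`, `W3`, `Wrh`, and `b_rel` are hypothetical names, not identifiers from the released code:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relation_prob(k, v_obj, h, W3, Wrh, b_rel):
    """p(relation unit k is on | observed objects v_obj, context units h)."""
    tri = np.einsum('i,j,ij->', v_obj, v_obj, W3[:, :, k])  # input through tri-way edges
    ctx = Wrh[k] @ h                                        # input from hidden (context) units
    return sigmoid(tri + ctx + b_rel[k])
```

The other tasks in the list follow the same pattern: clamp the known visible units and query (or sample) the unknown ones.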

Dataset

We formed a dataset of 6,976 scenes, half of which were sampled from the Visual Genome (VG) dataset and the other half from the SUN RGB-D dataset.

Our dataset covers 90 objects that commonly appear in scenes, including human-like entities (man, woman, boy, etc.), physical objects (cup, bottle, jacket, etc.), and building parts (door, window, etc.). The object vocabulary is given in object_vocabulary.txt.

Our dataset is composed of the following eight spatial relations: left, right, front, behind, on-top, under, above, below. These spatial relations are already annotated in the VG dataset; we extended the original SUN RGB-D dataset by manually annotating them. Moreover, we included verb relations from the VG dataset as affordances. The set of affordances comprises eat-ability, push-ability, play-ability, wear-ability, sit-ability, hold-ability, carry-ability, ride-ability, and use-ability.
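
For quick reference, the two vocabularies above can be written down as Python constants (the names `RELATIONS` and `AFFORDANCES` are illustrative, not identifiers from the code):

```python
# The eight spatial relations and nine affordances listed in the dataset description.
RELATIONS = ['left', 'right', 'front', 'behind', 'on-top', 'under', 'above', 'below']
AFFORDANCES = ['eat-ability', 'push-ability', 'play-ability', 'wear-ability',
               'sit-ability', 'hold-ability', 'carry-ability', 'ride-ability',
               'use-ability']
```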

COSMO_dataset.zip contains the dataset, split into three parts: 60% for training, 30% for testing, and 10% for validation. The dataset is organized into batches of 32 samples.
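
The split and batching described above can be sketched as follows; `split_into_batches` is a hypothetical helper for illustration, not a function from the repository:

```python
import numpy as np

def split_into_batches(n_samples, batch_size=32, seed=0):
    """Sketch of a 60/30/10 train/test/validation split into fixed-size batches."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train, n_test = int(0.6 * n_samples), int(0.3 * n_samples)
    parts = {
        'train': idx[:n_train],
        'test': idx[n_train:n_train + n_test],
        'validation': idx[n_train + n_test:],
    }
    # Chop each part into consecutive batches of at most batch_size samples.
    return {name: [ids[i:i + batch_size] for i in range(0, len(ids), batch_size)]
            for name, ids in parts.items()}
```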

Batches whose file names start with 's' (e.g., s_batch_05_3.csv) contain samples from the SUN RGB-D dataset; the others (e.g., batch_17) are from the VG dataset.
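
This naming convention can be checked programmatically; `batch_source` is a hypothetical helper, not part of the repository:

```python
def batch_source(filename):
    # Infer the source dataset from the batch file name, per the convention above:
    # an 's_' prefix marks SUN RGB-D batches, everything else comes from Visual Genome.
    return 'SUN RGB-D' if filename.startswith('s_') else 'Visual Genome'
```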

The info folder contains links to the images used in the dataset.

Using COSMO

Files

The repository contains the following files:

  • COSMO.py is the implementation of COSMO in Python using TensorFlow.
  • utils.py includes helper functions for the COSMO implementation.
  • utils2.py includes additional helper functions for the COSMO implementation.
  • zconfig.py includes configuration settings for the COSMO implementation.
  • experiment_train.py is a simple training script for COSMO.
  • experiment_test.py is a simple testing script for COSMO.

Installation Requirements

The following packages are required:

  • python >= 2.7.12
  • tensorflow >= 1.4.1
  • numpy >= 1.13.3
  • pandas >= 0.20.3

Training

To train COSMO on a specific dataset, use experiment_train.py. The hyper-parameters of COSMO and the paths to the training and validation sets are specified in that file. Run the following to train COSMO:

python experiment_train.py

Testing

To run the tasks on the test set, use experiment_test.py:

python experiment_test.py

References

İlker Bozcan, Sinan Kalkan. COSMO: Contextualized Scene Modeling with Boltzmann Machines. Submitted to Robotics and Autonomous Systems (RAS), special issue on Semantic Policy and Action Representations for Autonomous Robots (SPAR), 2018.
