LOGIC TENSOR NETWORKS FOR VISUAL RELATIONSHIP DETECTION

This repository contains the dataset, the source code and the models for the detection of visual relationships with Logic Tensor Networks.

Introduction

Semantic Image Interpretation is the task of extracting a structured semantic description from images. This requires the detection of visual relationships: triples (subject, relation, object) describing a semantic relation between the bounding box of a subject and the bounding box of an object. Here, we perform the detection of visual relationships by using Logic Tensor Networks (LTNs), a novel Statistical Relational Learning framework that exploits both the similarities with other seen relationships and background knowledge, expressed with logical constraints between subjects, relations and objects. The experiments are conducted on the Visual Relationship Dataset (VRD).
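As an illustration of the triple structure described above, a visual relationship can be sketched as a small Python data structure. This is only a sketch: the names `BoundingBox`, `Entity`, `VisualRelationship` and `as_triple`, the corner-coordinate box layout and the example labels are assumptions made for this example and are not taken from the repository code.

```python
from collections import namedtuple

# Hypothetical containers, not the repository's own data format.
# A box is assumed to be stored as corner coordinates in pixels.
BoundingBox = namedtuple("BoundingBox", ["x_min", "y_min", "x_max", "y_max"])
Entity = namedtuple("Entity", ["label", "box"])
VisualRelationship = namedtuple("VisualRelationship",
                                ["subject", "relation", "object"])


def as_triple(relationship):
    """Return the (subject, relation, object) labels of a relationship."""
    return (relationship.subject.label,
            relationship.relation,
            relationship.object.label)


# Illustrative values only.
person = Entity("person", BoundingBox(10, 20, 110, 220))
horse = Entity("horse", BoundingBox(30, 120, 260, 300))
example = VisualRelationship(person, "ride", horse)

print(as_triple(example))
```

Keeping the bounding boxes next to the labels reflects the definition in the text: the relation holds between the two boxes, not only between the two class names.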

A detailed description of the work is provided in our paper Compensating Supervision Incompleteness with Prior Knowledge in Semantic Image Interpretation at IJCNN 2019:

@inproceedings{donadello2019compensating,
  author    = {Ivan Donadello and Luciano Serafini},
  title     = {Compensating Supervision Incompleteness with Prior Knowledge in Semantic Image Interpretation},
  booktitle = {{IJCNN}},
  pages     = {1--8},
  publisher = {{IEEE}},
  year      = {2019}
}

A video demo of the system is also available.

Using the Source Code

  • The data folder contains the LTN encoding of the VRD training and test sets, the ontology that defines the logical constraints, and the images of the VRD test set. Images and their annotations can be downloaded from https://cs.stanford.edu/people/ranjaykrishna/vrd/.
  • The models folder contains the trained grounded theories from the experiments.
  • The Visual-Relationship-Detection-master folder contains the object detector model and the evaluation code provided in https://github.com/Prof-Lu-Cewu/Visual-Relationship-Detection for evaluating the phrase, relationship and predicate detection tasks on the VRD.

Requirements

We trained and tested the grounded theories with the software configuration below; more recent versions of the libraries may also work:

  • Ubuntu 14.04;
  • MATLAB R2014a;
  • Python 2.7.6;
  • TensorFlow 0.11.0;
  • NumPy 1.13.1;
  • Scikit-learn 0.18.1;
  • Matplotlib 1.5.1.
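To compare an existing environment against the list above, a small check along the following lines may help. It is a sketch under assumptions: the function names `installed_version` and `compare_versions` are made up for this example, the import names (`tensorflow`, `numpy`, `sklearn`, `matplotlib`) are the usual ones for these libraries, and an exact string match is used even though newer versions may also work.

```python
import importlib

# Versions copied from the requirements list; keys are import names.
EXPECTED_VERSIONS = {
    "tensorflow": "0.11.0",
    "numpy": "1.13.1",
    "sklearn": "0.18.1",
    "matplotlib": "1.5.1",
}


def installed_version(module_name):
    """Return the module's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def compare_versions(expected, lookup=installed_version):
    """Map each module name to (expected, found, exact_match)."""
    report = {}
    for name, wanted in expected.items():
        found = lookup(name)
        report[name] = (wanted, found, found == wanted)
    return report
```

Calling `compare_versions(EXPECTED_VERSIONS)` returns one entry per library; a `found` value of `None` means the library is not installed, and a mismatch is only a hint, not necessarily an error.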

Training a grounded theory

To train a grounded theory, run the following command:

$ python train.py
  • The trained grounded theories are saved in the models folder as KB_nc_2500.ckpt (no constraints) and KB_wc_2500.ckpt (with constraints). The number in the filename (2500) is the number of training iterations, which is set by a parameter in the code.
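The naming scheme of the saved files can be summarized with a short helper. This is a sketch that only reproduces the pattern visible in the two filenames above; the function name `checkpoint_path` and its parameters are hypothetical and do not come from train.py.

```python
import os


def checkpoint_path(iterations, with_constraints, models_dir="models"):
    """Build a checkpoint path following the KB_<tag>_<iterations>.ckpt pattern.

    The tag is "wc" (with constraints) or "nc" (no constraints).
    """
    tag = "wc" if with_constraints else "nc"
    filename = "KB_{}_{}.ckpt".format(tag, iterations)
    return os.path.join(models_dir, filename)


print(checkpoint_path(2500, with_constraints=False))
print(checkpoint_path(2500, with_constraints=True))
```

If the iteration parameter in the code is changed, the expected filename changes accordingly, so scripts that load a checkpoint need to look for the matching name.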

Evaluating the grounded theories

To run the evaluation, use the following commands:

$ python predicate_detection.py
$ python relationship_phrase_detection.py

Then launch MATLAB, move into the Visual-Relationship-Detection-master folder, and execute the scripts predicate_detection_LTN.m and relationship_phrase_detection_LTN.m to see the results.
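The two Python evaluation commands above can also be launched in sequence from a small driver script. The following is a minimal sketch: the helper names `evaluation_commands` and `run_evaluation` are made up for this example, the scripts are assumed to be in the current working directory, and the MATLAB step is not automated here.

```python
import subprocess

# Script names as given in the evaluation commands above.
EVALUATION_SCRIPTS = [
    "predicate_detection.py",
    "relationship_phrase_detection.py",
]


def evaluation_commands(python_executable="python"):
    """Return one argument list per evaluation script, in run order."""
    return [[python_executable, script] for script in EVALUATION_SCRIPTS]


def run_evaluation(commands, runner=subprocess.check_call):
    """Run each command in order and return how many were run.

    With the default runner, a non-zero exit status raises
    subprocess.CalledProcessError and stops the sequence.
    """
    for command in commands:
        runner(command)
    return len(commands)
```

Calling `run_evaluation(evaluation_commands())` from the repository folder runs both scripts one after the other; the `runner` argument exists only so the sequence can be inspected or tested without launching the scripts.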
