Learning Conditioned Graph Structures for Interpretable Visual Question Answering

This code provides a PyTorch implementation of our graph learning method for Visual Question Answering, as described in Learning Conditioned Graph Structures for Interpretable Visual Question Answering (arXiv:1806.07243).

Model diagram

Examples of learned graph structures

Getting Started

Reference

If you use our code or any of the ideas from our paper, please cite:

@article{learningconditionedgraph,
author = {Will Norcliffe-Brown and Efstathios Vafeias and Sarah Parisot},
title = {Learning Conditioned Graph Structures for Interpretable Visual Question Answering},
journal = {arXiv preprint arXiv:1806.07243},
year = {2018}
}

Requirements

Data

To download and unzip the required datasets, change to the data folder and run:

$ cd data; python download_data.py

To preprocess the image data and the text data, run the following two commands respectively. Set --data to trainval or test for preprocess_image.py, and to train, val or test for preprocess_text.py, depending on which dataset you want to preprocess:

$ python preprocess_image.py --data trainval; python preprocess_text.py --data train
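Preprocessing every split means repeating the two commands above with different --data values. The following is a minimal sketch of how the full list of invocations could be generated, assuming only the script names and split names given above. The helper name preprocessing_commands is hypothetical and not part of this repository.

```python
import shlex

# Split names taken from the instructions above.
IMAGE_SPLITS = ("trainval", "test")
TEXT_SPLITS = ("train", "val", "test")


def preprocessing_commands():
    """Return one argv list per (script, split) pair, image scripts first.

    Hypothetical helper for illustration; it only builds the commands.
    """
    commands = []
    for split in IMAGE_SPLITS:
        commands.append(["python", "preprocess_image.py", "--data", split])
    for split in TEXT_SPLITS:
        commands.append(["python", "preprocess_text.py", "--data", split])
    return commands


if __name__ == "__main__":
    # Print the commands rather than executing them, so the sketch is safe
    # to run anywhere.
    for cmd in preprocessing_commands():
        print(shlex.join(cmd))
```

To actually run the steps, each list could be passed to subprocess.run(cmd, check=True) from the directory that contains the preprocessing scripts.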

Pretrained model

A pretrained model is available here: example model. This model achieved 66.2% accuracy on the test set.

Training

To train a model on the train set with our default parameters, run:

$ python run.py --train

To train a model on both the train and validation sets, for evaluation on the test set, run:

$ python run.py --trainval

Models can be validated with:

$ python run.py --eval --model_path path_to_your_model

JSON results for the test set can be produced with:

$ python run.py --test --model_path path_to_your_model
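The four run.py modes shown above differ in whether they take a model path. As an illustrative sketch only, a small wrapper could assemble the command line for each mode. It assumes just the flags shown in this README (--train, --trainval, --eval, --test, --model_path); the function run_command is hypothetical, and whether the training modes also accept --model_path is not stated here, so the sketch passes it only for --eval and --test.

```python
# Modes and flags as they appear in this README.
VALID_MODES = ("train", "trainval", "eval", "test")
MODES_NEEDING_MODEL = ("eval", "test")


def run_command(mode, model_path=None):
    """Build the argv list for one run.py invocation.

    Hypothetical helper for illustration; not part of the repository.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    cmd = ["python", "run.py", f"--{mode}"]
    if mode in MODES_NEEDING_MODEL:
        # Evaluation and test runs are shown with a trained model path.
        if model_path is None:
            raise ValueError(f"--{mode} needs a model path")
        cmd += ["--model_path", model_path]
    return cmd
```

For example, run_command("eval", "path_to_your_model") returns the argument list for the validation command above, which could then be passed to subprocess.run.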

To list the model training parameters with their descriptions, run:

$ python run.py --help

Authors

  • Will Norcliffe-Brown
  • Sarah Parisot
  • Stathis Vafeias

License

This project is licensed under the Apache 2.0 license; see the Apache license for details.

Acknowledgements

Our code is based on this implementation of the 2017 VQA challenge winner: https://github.com/markdtw/vqa-winner-cvprw-2017
