Med-VQA

In this repository we test three VQA models on the ImageCLEF-2019 dataset. Two of them are built on top of Facebook AI Research's Multi-Modal Framework (MMF).

| Model Name | Accuracy | Number of Epochs |
|---|---|---|
| Hierarchical Question-Image Co-attention | 48.32% | 42 |
| MMF Transformer | 51.76% | 30 |
| MMBT | 86.78% | 30 |

Test them for yourself!

Download the dataset from here and place it in a directory named /dataset/med-vqa-data/ inside the directory where this repository is cloned.
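Before launching a run, it can help to confirm the data sits where it is expected. The following is a minimal sketch, not part of the repository: it assumes the path above is meant relative to the repository root, and the helper names `dataset_dir` and `check_dataset` are hypothetical.

```python
from pathlib import Path


def dataset_dir(repo_root):
    """Return the expected dataset location under the cloned repository."""
    # Assumption: /dataset/med-vqa-data/ is relative to the repository root.
    return Path(repo_root) / "dataset" / "med-vqa-data"


def check_dataset(repo_root="."):
    """Return True if the dataset directory exists and contains at least one entry."""
    path = dataset_dir(repo_root)
    return path.is_dir() and any(path.iterdir())


if __name__ == "__main__":
    if check_dataset():
        print("Dataset found at", dataset_dir("."))
    else:
        print("Dataset missing; expected it at", dataset_dir("."))
```

Run it from the repository root; an empty or missing directory is reported as missing.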

MMF Transformer:

mmf_run config=projects/hateful_memes/configs/mmf_transformer/defaults.yaml model=mmf_transformer dataset=hateful_memes training.checkpoint_interval=100 training.max_updates=3000

MMBT:

mmf_run config=projects/hateful_memes/configs/mmbt/defaults.yaml model=mmbt dataset=hateful_memes training.checkpoint_interval=100 training.max_updates=3000

Hierarchical Question-Image Co-attention:

cd hierarchical
python main.py
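The two mmf_run commands above differ only in the model name, so they can be generated from one template. This is a sketch under that assumption, not a script shipped with the repository: `build_mmf_command` and `run` are hypothetical helpers, and the arguments are copied verbatim from the commands in this README.

```python
import shlex
import subprocess


def build_mmf_command(model, max_updates=3000, checkpoint_interval=100):
    """Assemble the mmf_run argument list used in this README for one model."""
    return [
        "mmf_run",
        f"config=projects/hateful_memes/configs/{model}/defaults.yaml",
        f"model={model}",
        "dataset=hateful_memes",
        f"training.checkpoint_interval={checkpoint_interval}",
        f"training.max_updates={max_updates}",
    ]


def run(model, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    cmd = build_mmf_command(model)
    print(shlex.join(cmd))
    if not dry_run:
        # Requires mmf_run to be installed and on PATH.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    for name in ("mmf_transformer", "mmbt"):
        run(name)  # dry run: prints the command without starting training
```

Calling `run("mmbt", dry_run=False)` would start the MMBT run; the default only prints the command so it can be inspected first.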

Dataset details:

The models were trained on the VQA-Med dataset from the "ImageCLEF 2019: Visual Question Answering in Medical Domain" competition. Below are a few plots of dataset statistics.

Distribution of the type of questions in the dataset.
Plot of frequency of words in answer.
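Both statistics can be recomputed with the Python standard library. This is a sketch, not the script that produced the plots: it assumes QA pairs are stored one per line as `image_id|question|answer` and grouped by question category, and the sample records and helper names below are made up for illustration.

```python
from collections import Counter

# Made-up sample records for illustration only; not taken from the dataset.
SAMPLES = {
    "modality": ["img001|what modality is shown?|ct with contrast"],
    "plane": [
        "img002|in what plane is this image?|axial",
        "img003|what plane is this?|sagittal",
    ],
    "organ": ["img004|what organ is shown?|lung"],
}


def parse_line(line):
    """Split one 'image_id|question|answer' line into its three fields."""
    image_id, question, answer = line.strip().split("|", 2)
    return image_id, question, answer


def question_type_counts(samples):
    """Count how many questions fall under each category."""
    return Counter({category: len(lines) for category, lines in samples.items()})


def answer_word_counts(samples):
    """Count whitespace-separated, lowercased words across all answers."""
    counts = Counter()
    for lines in samples.values():
        for line in lines:
            _, _, answer = parse_line(line)
            counts.update(answer.lower().split())
    return counts


if __name__ == "__main__":
    print(question_type_counts(SAMPLES).most_common())
    print(answer_word_counts(SAMPLES).most_common(5))
```

The resulting `Counter` objects can be passed to any plotting library to draw the bar charts.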

Contributors

kshitij-ambilduke