
ch-icl's Introduction

Candidate-Heuristic In-Context Learning: A New Framework for Enhancing MedVQA with Large Language Models

💡Overview

CH-ICL is a candidate-heuristic framework that transforms images into text using only a few trainable parameters and leverages the contextual understanding of LLMs to improve the performance of existing medical VQA models. The method is plug-and-play, and its effectiveness has been demonstrated on three existing medical VQA datasets.


📔Pathology Terminology Dictionary

The keywords describing our dataset images are highly diverse, covering image types, systems and organs, diseases, symptoms, staining techniques, and more.


🔨Setup

Requirements

conda create -n chicl python=3.8
conda activate chicl
pip install -r requirements.txt

🔨Pre-trained Weights (Optional)

Download BiomedCLIP and place it in ./src/backbone/BiomedCLIP.

BiomedCLIP links:

Note: Downloading the weights directly from Hugging Face may run into network issues. To make the weights easier to modify, we have converted the original .bin file to PyTorch's .pth format. We recommend using the Baiduyun version.

📑Data Preparation

Our data mainly comes from the publicly available, free online Pathology Education Informational Resource (PEIR) Digital Library. We test our model on:

Prepare BiomedCLIP Pre-extracted Image Feature

Note: We recommend using our pre-extracted BiomedCLIP features. The original images are also available at the links below:

Dataset    Pre-extracted Features & Original Images
PEIR       Baiduyun
PathVQA    Baiduyun
Slake      Baiduyun
RADVQA     Baiduyun
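
Once the downloads are unpacked, a quick layout check can catch misplaced files before training. The sketch below is not part of the repository; the helper name `missing_dirs` is hypothetical, and the directory list simply mirrors the paths used for PathVQA in this README. It only checks that each directory exists and is non-empty, since the exact file names inside are not documented here.

```python
from pathlib import Path

# Paths taken from the PathVQA instructions in this README; adjust for other datasets.
EXPECTED_DIRS = [
    "src/backbone/BiomedCLIP",   # pre-trained BiomedCLIP weights (optional)
    "data/PathVQA",              # pre-extracted image features
    "data/Annotations/PathVQA",  # annotation files passed via --data_path
]


def missing_dirs(project_root, expected=EXPECTED_DIRS):
    """Return the expected directories that are absent or empty under project_root."""
    root = Path(project_root)
    missing = []
    for rel in expected:
        path = root / rel
        # A directory counts as missing if it does not exist or contains nothing.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    for rel in missing_dirs("."):
        print(f"missing or empty: {rel}")
```

Run it from the project root; no output means every listed directory is present and non-empty.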

Training and Validation:

Below are the pre-trained keyword and VQA weights:

For PathVQA, as an example:

  • Download the features and place them in ./data/PathVQA. Execute:
python3 src/trainval.py \
        --dataset 'pathvqa' \
        --data_path './data/Annotations/PathVQA' \
        --feature_path './data/PathVQA' \
        --batch_size 128 \
        --freeze \
        --d_input 768 \
        --method 'biomed'
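
If you prefer to launch runs from Python (for example, to script several configurations), the command above can be assembled as an argument list. This is a minimal sketch, not a utility shipped with the repository: `build_trainval_command` is a hypothetical helper, it uses only the flags shown above, and it assumes `--freeze` is a boolean switch because it appears without a value.

```python
import shlex


def build_trainval_command(dataset, data_path, feature_path,
                           batch_size=128, d_input=768,
                           method="biomed", freeze=True):
    """Assemble the argument list for src/trainval.py from the flags shown above."""
    cmd = [
        "python3", "src/trainval.py",
        "--dataset", dataset,
        "--data_path", data_path,
        "--feature_path", feature_path,
        "--batch_size", str(batch_size),
        "--d_input", str(d_input),
        "--method", method,
    ]
    if freeze:
        # Assumed to be a flag without a value, as in the command above.
        cmd.append("--freeze")
    return cmd


cmd = build_trainval_command(
    "pathvqa", "./data/Annotations/PathVQA", "./data/PathVQA"
)
# Print a shell-quoted version for inspection before running anything.
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` from the project root would start the run; the list form avoids shell-quoting problems with paths.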

Testing

Run test.py and test_peir.py separately to generate the top-k candidate answers and the top-k keywords, respectively (see test.sh):

python3 src/test.py \
        --dataset 'pathvqa' \
        --data_path './data/Annotations/PathVQA' \
        --feature_path './data/PathVQA' \
        --batch_size 128 \
        --visible \
        --method 'biomed' \
        --checkpoint 'your_checkpoint_path.pth'

python3 src/test_peir.py \
        --dataset 'pathvqa' \
        --data_path './data/Annotations/PathVQA' \
        --feature_path './data/PathVQA' \
        --batch_size 32 \
        --visible \
        --method 'biomed' \
        --checkpoint './checkpoints/peir/biomed_freeze/ckpt_best_model.pth'

The results pathvqa_results.json and pathvqa_keyword_results.json will be saved in the project's root directory.
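
Before moving on, it can help to confirm that both files were written and parse as JSON. The sketch below makes no assumption about the schema of either file, which is not documented here; it only reports the top-level JSON type and the number of entries. The helper name `summarize_results` is hypothetical.

```python
import json
from pathlib import Path


def summarize_results(path):
    """Return (top-level JSON type name, entry count) for a results file."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    # Lists and dicts have a natural length; anything else counts as one entry.
    count = len(data) if isinstance(data, (list, dict)) else 1
    return type(data).__name__, count


if __name__ == "__main__":
    for name in ("pathvqa_results.json", "pathvqa_keyword_results.json"):
        if Path(name).is_file():
            print(name, *summarize_results(name))
        else:
            print(name, "not found; run the corresponding test script first")
```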

Then update the paths to the keyword and VQA result JSON files in tools/openai_api_test_keyword.py and execute:

python3 tools/openai_api_test_keyword.py
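
To illustrate the idea behind this step, the sketch below shows one way top-k candidate answers and image keywords could be combined into an in-context prompt for an LLM. It is not the code in tools/openai_api_test_keyword.py: the function name `build_prompt`, the prompt wording, the input shapes, and the sample values are all assumptions made for illustration.

```python
def build_prompt(question, candidates, keywords, examples=()):
    """Format one MedVQA query as text for an LLM.

    candidates: list of (answer, confidence) pairs from the VQA model (assumed shape).
    keywords:   list of keyword strings describing the image (assumed shape).
    examples:   optional (question, candidates, keywords, answer) tuples used as
                in-context demonstrations.
    """
    def block(q, cands, kws, answer=""):
        cand_text = ", ".join(f"{a} ({c:.2f})" for a, c in cands)
        return (
            f"Image keywords: {', '.join(kws)}\n"
            f"Question: {q}\n"
            f"Candidates: {cand_text}\n"
            f"Answer: {answer}"
        )

    parts = ["Answer the medical question using the image keywords "
             "and candidate answers."]
    # Demonstrations come first, each with its answer filled in.
    parts += [block(q, c, k, a) for q, c, k, a in examples]
    # The final block leaves the answer empty for the LLM to complete.
    parts.append(block(question, candidates, keywords))
    return "\n\n".join(parts)


# Made-up values, for illustration only.
prompt = build_prompt(
    "What is shown in the image?",
    [("liver", 0.62), ("kidney", 0.21)],
    ["histology", "liver", "H&E stain"],
)
print(prompt)
```

Keeping the candidates and their confidences in the prompt lets the LLM treat the VQA model's output as a hint rather than a final answer, which matches the candidate-heuristic framing in the overview.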

📝Acknowledgements

We also reference the excellent repos of BiomedCLIP and PubMedCLIP, as well as the repos of the specific baselines we examined (see the paper).

📝Citation

If you find this paper useful, please consider starring 🌟 this repo and citing 📑 our paper:

