The reference code for "Improving Factual Completeness and Consistency of Image-to-text Radiology Report Generation".
- CNN-RNN-RNN (Liu et al., 2019)
- Knowing When to Look (Lu et al., 2017)
- Meshed-Memory Transformer (Cornia et al., 2020)
- Show, Attend and Tell (Xu et al., 2015)
- TieNet (Wang et al., 2018)
- MIMIC-CXR-JPG (Johnson et al., 2019)
- Open-i (Demner-Fushman et al., 2012)
NOTE: We are working to make the radiology NLI dataset publicly available.
- A Linux OS (tested on Ubuntu 16.04)
- Memory over 24GB
- A GPU with memory over 12GB (tested on NVIDIA Titan X and NVIDIA Titan XP)
Create a conda environment
$ conda env create -f environment.yml
NOTE: environment.yml is set up for CUDA 10.1 and cuDNN 7.6.3. These versions may need to be changed depending on the runtime environment.
- Download MIMIC-CXR-JPG
- Make a resized copy of MIMIC-CXR-JPG using resize_mimic-cxr-jpg.py (MIMIC_CXR_ROOT is a dataset directory containing mimic-cxr)
$ python resize_mimic-cxr-jpg.py MIMIC_CXR_ROOT
- Create the sections file of MIMIC-CXR (mimic_cxr_sectioned.csv.gz) with create_sections_file.py
- Move mimic_cxr_sectioned.csv.gz to MIMIC_CXR_ROOT/mimic-cxr-resized/2.0.0/
Pre-calculate document frequencies that will be used in CIDEr by:
$ python cider-df.py MIMIC_CXR_ROOT mimic-cxr_train-df.bin.gz
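To illustrate what this pre-calculation produces, here is a minimal sketch of counting n-gram document frequencies over a set of reports, the statistic CIDEr uses for its TF-IDF weighting. It is not the code in cider-df.py: the function names, the whitespace tokenization, and the toy reports are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def ngrams(tokens: List[str], max_n: int = 4) -> Iterator[Tuple[str, ...]]:
    """Yield every n-gram of length 1..max_n from a token list."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def document_frequencies(reports: Iterable[str], max_n: int = 4) -> Counter:
    """Count, for each n-gram, the number of reports that contain it."""
    df: Counter = Counter()
    for report in reports:
        # Assumption: lowercase whitespace tokenization stands in for the real tokenizer.
        tokens = report.lower().split()
        # A set ensures each n-gram is counted at most once per report.
        df.update(set(ngrams(tokens, max_n)))
    return df


# Hypothetical toy reports, not taken from MIMIC-CXR.
reports = [
    "no acute cardiopulmonary process",
    "no pleural effusion",
    "small left pleural effusion",
]
df = document_frequencies(reports)
print(df[("no",)], df[("pleural", "effusion")])
```

Computing these counts once over the training split avoids recomputing them every time CIDEr is evaluated during training.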
Pre-recognize named entities in MIMIC-CXR by:
$ python ner_reports.py --stanza-download MIMIC_CXR_ROOT mimic-cxr_ner.txt.gz
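As a rough illustration of why entities are recognized ahead of time, the sketch below scores a generated report against a reference by exact overlap of their entity sets. The harmonic mean of precision and recall used here is an assumption about how an exact entity match score could be computed, not necessarily the formula used in this repository, and the function name and example entities are hypothetical.

```python
from typing import Iterable


def entity_match_f1(generated: Iterable[str], reference: Iterable[str]) -> float:
    """Harmonic mean of entity precision and recall (assumed scoring rule)."""
    gen, ref = set(generated), set(reference)
    if not gen or not ref:
        return 0.0
    overlap = len(gen & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(gen)  # share of generated entities found in the reference
    recall = overlap / len(ref)     # share of reference entities found in the generation
    return 2 * precision * recall / (precision + recall)


# Hypothetical entity sets for one generated/reference report pair.
score = entity_match_f1(
    ["effusion", "pneumothorax"],
    ["effusion", "cardiomegaly", "edema"],
)
print(round(score, 3))
```

Because the reference entities do not change during training, recognizing them once and caching the result keeps the NER model out of the training loop for the reference side.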
Download pre-trained CheXpert weights, pre-trained radiology NLI weights, and GloVe embeddings
$ cd resources
$ ./download.sh
First, train the Meshed-Memory Transformer model with an NLL loss.
# NLL
$ python train.py --cuda --corpus mimic-cxr --cache-data cache --epochs 32 --batch-size 24 --entity-match mimic-cxr_ner.txt.gz --img-model densenet --img-pretrained resources/chexpert_auc14.dict.gz --cider-df mimic-cxr_train-df.bin.gz --bert-score distilbert-base-uncased --lr-scheduler trans MIMIC_CXR_ROOT resources/glove_mimic-cxr_train.512.txt.gz out_m2trans_nll
Second, further train the model with a joint loss using self-critical RL to achieve better performance.
# RL with NLL + BERTScore + EntityMatchExact
$ python train.py --cuda --corpus mimic-cxr --cache-data cache --epochs 32 --batch-size 24 --rl-epoch 1 --rl-metrics BERTScore,EntityMatchExact --rl-weights 0.01,0.495,0.495 --entity-match resources/mimic-cxr_ner.txt.gz --baseline-model out_m2trans_nll/model_31-152173.dict.gz --img-model densenet --img-pretrained resources/chexpert_auc14.dict.gz --cider-df mimic-cxr_train-df.bin.gz --bert-score distilbert-base-uncased --lr 5e-6 --lr-step 32 MIMIC_CXR_ROOT resources/glove_mimic-cxr_train.512.txt.gz out_m2trans_nll-bs-emexact
# RL with NLL + BERTScore + EntityMatchNLI
$ python train.py --cuda --corpus mimic-cxr --cache-data cache --epochs 32 --batch-size 24 --rl-epoch 1 --rl-metrics BERTScore,EntityMatchNLI --rl-weights 0.01,0.495,0.495 --entity-match resources/mimic-cxr_ner.txt.gz --baseline-model out_m2trans_nll/model_31-152173.dict.gz --img-model densenet --img-pretrained resources/chexpert_auc14.dict.gz --cider-df mimic-cxr_train-df.bin.gz --bert-score distilbert-base-uncased --lr 5e-6 --lr-step 32 MIMIC_CXR_ROOT resources/glove_mimic-cxr_train.512.txt.gz out_m2trans_nll-bs-emnli
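The joint loss in the commands above can be sketched as follows. This is a minimal scalar illustration, not the implementation in train.py. It assumes that the first value of --rl-weights scales the NLL term and the remaining values scale the metrics listed in --rl-metrics, in order. It also assumes the standard self-critical formulation, in which the reward of a sampled report is baselined by the reward of the greedy-decoded report. The function and argument names are hypothetical.

```python
from typing import Sequence


def joint_self_critical_loss(
    nll: float,
    sample_log_prob: float,
    sample_rewards: Sequence[float],
    greedy_rewards: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Combine an NLL term with one self-critical RL term per reward metric.

    Assumption: weights[0] scales NLL and weights[1:] scale the metrics in order.
    """
    if len(sample_rewards) != len(greedy_rewards):
        raise ValueError("sample and greedy rewards must have the same length")
    if len(weights) != len(sample_rewards) + 1:
        raise ValueError("expected one weight for NLL plus one per metric")

    loss = weights[0] * nll
    for w, r_sample, r_greedy in zip(weights[1:], sample_rewards, greedy_rewards):
        # Self-critical baseline: only samples that beat greedy decoding are reinforced.
        advantage = r_sample - r_greedy
        # REINFORCE-style term: minimizing it raises the log-probability of good samples.
        loss += -w * advantage * sample_log_prob
    return loss


# Hypothetical values for one report; the two rewards stand for BERTScore and entity match.
loss = joint_self_critical_loss(
    nll=2.0,
    sample_log_prob=-5.0,
    sample_rewards=[0.6, 0.5],
    greedy_rewards=[0.5, 0.3],
    weights=[0.01, 0.495, 0.495],
)
print(round(loss, 4))
```

In a real training loop these quantities would be batched tensors and the rewards would be treated as constants with respect to the gradient. The small NLL weight (0.01) appears intended to keep the model anchored to the reference text while the reward terms dominate.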
Training results can be checked with TensorBoard.
$ tensorboard --logdir out_m2trans_nll-bs-emnli/log
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.0.0 at http://localhost:6006/ (Press CTRL+C to quit)
See LICENSE and clinicgen/external/LICENSE_bleu-cider-rouge-spice for details.