Video pretraining advances 3D deep learning on chest CTs

This repository contains code to train and evaluate models on the RSNA PE dataset and the LIDC-IDRI dataset for our paper Video pretraining advances 3D deep learning on chest CTs.

Table of Contents

  1. System Requirements
  2. Installation
  3. Datasets
  4. Usage
  5. Demo

System Requirements

Hardware requirements

The data processing steps require only a standard computer with enough RAM to support the in-memory operations.

For training and testing models, a computer with sufficient GPU memory is recommended.

Software requirements

OS requirements

All models have been trained and tested on a Linux system (Ubuntu 16.04).

Python dependencies

All dependencies can be found in environment.yml.

Installation

  1. Please install Anaconda in order to create a Python environment.
  2. Clone this repo (from the command line: git clone git@github.com:rajpurkarlab/2021-fall-chest-ct.git).
  3. Create the environment: conda env create -f environment.yml.
  4. Activate the environment: source activate pe_models.
  5. Install PyTorch 1.7.1 with the right CUDA version.

Installation should take less than 10 minutes with stable internet.

Datasets

RSNA

Download dataset from: RSNA PE Dataset

Make sure to update PROJECT_DATA_DIR in pe_models/constants.py with path to the directory that contains the RSNA dataset.
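For example, the relevant line in pe_models/constants.py might look like the sketch below. The variable name PROJECT_DATA_DIR comes from this README; the path is a placeholder you should replace with your own.

```python
# pe_models/constants.py (sketch): PROJECT_DATA_DIR is the variable this
# README asks you to update. The path below is only a placeholder.
from pathlib import Path

# Directory that contains the downloaded RSNA PE dataset
PROJECT_DATA_DIR = Path("/data/rsna-pe")
```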

Preprocessing

Please download the pre-processed label file that contains the data split and DICOM header information using this link and place it in the RSNA data directory.

Alternatively, you can create the pre-processed file by running:

$ python pe_models/preprocess/rsna.py

Test

To ensure that the dataset is correct and that data load in the correct format, run the following unit test:

$ python -W ignore -m unittest

Note that this might take a couple of minutes to complete.

You can also visually inspect example inputs in data/test/ after the unittest is complete.

LIDC

Download dataset from TCIA Public Access into a PROJECT_DATA_DIR/lidc folder.

Preprocessing

Install pylidc and set up your ~/.pylidcrc file using the official installation instructions.
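A minimal ~/.pylidcrc might look like the sketch below; consult the official pylidc installation instructions for the exact keys, and substitute the path to your LIDC download.

```
[dicom]
path = /path/to/PROJECT_DATA_DIR/lidc
warn = True
```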

You can then create all the necessary pre-processed files by running:

$ python pe_models/preprocess/lidc.py

You can then set the type in an experiment YAML to lidc-window or lidc-2d to train on the LIDC dataset.
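For instance, the dataset section of an experiment YAML might be switched as follows (the field nesting here is an assumption; only the type values come from this README):

```yaml
data:
  type: lidc-window   # or lidc-2d for the 2D variant
```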

Usage

To train a model, run the following:

python run.py --config <path_to_config_file> --train

For more documentation, please run:

python run.py --help

To test a model, use the --test flag, making sure that either the --checkpoint flag is specified or that the config YAML contains a checkpoint entry:

python run.py --config <path_to_config_file> --checkpoint <path_to_ckpt> --test

To featurize all studies in a dataset (for example, to run a 1D model), use the --test_split all flag.

Example configs can be found in ./configs/
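As a rough sketch, a config might combine a dataset type with model and checkpoint entries. Field names other than checkpoint and the data type are assumptions; refer to the real examples in ./configs/.

```yaml
data:
  type: rsna
model:
  name: resnet18
checkpoint: null   # set to a .ckpt path, or pass --checkpoint at test time
```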

Run hyperparameter sweep with wandb

Example hyperparameter sweep configs for each model can be found in ./configs/

wandb sweep <path_to_sweep_config>
wandb agent <sweep-id>
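A sweep config follows the standard wandb sweep schema (program, method, metric, parameters). The sketch below is illustrative only; the metric and parameter names are assumptions, so use the files in ./configs/ instead.

```yaml
program: run.py
method: bayes
metric:
  name: val_loss
  goal: minimize
parameters:
  lr:
    values: [0.001, 0.0001, 0.00001]
```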

Custom dataset

To train/test model on custom datasets:

  1. Please ensure that your data adhere to the same format as the RSNA/LIDC dataset. (See Example)
  2. Create a dataloader similar to RSNA/LIDC in ./datasets and update ./datasets/__init__.py to include the name of your custom dataloader.
  3. Make sure the data.type in your config file points to the name of your dataloader.
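The registration pattern in step 2 might look roughly like this sketch. All names here are hypothetical; mirror the existing RSNA/LIDC loaders in ./datasets for the real interface.

```python
# Hypothetical custom dataloader; a real one should mirror the RSNA/LIDC
# loaders in ./datasets.
class CustomCTDataset:
    """Loads a custom chest-CT dataset stored in the RSNA/LIDC format."""

    def __init__(self, data_dir):
        self.data_dir = data_dir

# In ./datasets/__init__.py, a registry could map the config's data.type
# string to the loader class so run.py can construct it by name.
DATASETS = {"custom_ct": CustomCTDataset}

def build_dataset(data_type, data_dir):
    return DATASETS[data_type](data_dir)
```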

Demo

To run the train/test script on a simulated demo dataset, use:

python run.py --config ./data/demo/resnet18_demo.yaml --checkpoint <path_to_ckpt> --test

You should expect the following results:

{'test/mean_auprc': 0.9107142686843872,
 'test/mean_auroc': 0.9166666865348816,
 'test/negative_exam_for_pe_auprc': 0.9107142686843872,
 'test/negative_exam_for_pe_auroc': 0.9166666865348816,
 'test_loss': 0.6920164227485657,
 'test_loss_epoch': 0.6920164227485657}

With a GPU, this should take less than 10 minutes to run.

Citation

If our work was useful in your research, please consider citing:

@inproceedings{ke2023video,
    title={Video Pretraining Advances 3D Deep Learning on Chest CT Tasks}, 
    author={Alexander Ke and Shih-Cheng Huang and Chloe P O'Connell and Michal Klimont and Serena Yeung and Pranav Rajpurkar},
    booktitle={Medical Imaging with Deep Learning},
    year={2023},
    eprint={2304.00546},
    archivePrefix={arXiv},
    primaryClass={eess.IV}
}
