

Eliminating Biasing Signals in Lung Cancer Images for Prognosis Predictions with Deep Learning

This repository contains the files needed to reproduce the results of the paper "Eliminating Biasing Signals in Lung Cancer Images for Prognosis Predictions with Deep Learning" by W.A.C. van Amsterdam, J.J.C. Verhoeff, P.A. de Jong, T. Leiner and M.J.C. Eijkemans, published in Nature Digital Medicine, 2019.

Replicating the experiments

See this release for the code that generated the published results


Please follow these steps to replicate the results as published. The original Python scripts are (somewhat) self-explanatory. They also contain code that was useful during initial experiments but was not used for the final publication.

Installation

The easiest way to set this up is to create a new conda environment and install all dependencies using conda and pip:

conda create --name elimbias
conda activate elimbias
conda install python=3.7.3 tqdm numpy pandas feather-format nibabel pillow scikit-learn tensorboard future seaborn
conda install -c pytorch pytorch=1.1.0 torchvision
pip install pyro-ppl==0.3.0 pypng pylidc

Pre-processing

Go to the subfolder elimbias/preproces and follow the steps in the README there.

The goal of these steps is to end up with a collection of images that are ready for a neural network, each with associated measurements (e.g. size and variance) that can be used in a structural causal model.

The result is a data folder that contains the images split into train / valid subfolders (a test subfolder is optional and not created by default), with the associated measurements in a labels.csv file.
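As a quick sanity check after pre-processing, the layout described above can be inspected with a short script. This is only a sketch: the function name `summarize_data_folder` is hypothetical, and it assumes that labels.csv sits at the top level of the data folder and that every file in a split subfolder is an image. Adjust it to whatever the pre-processing README actually produces.

```python
import csv
from pathlib import Path


def summarize_data_folder(data_dir):
    """Count files per split and read the header row of labels.csv.

    Assumptions (not confirmed by the repository): labels.csv lives
    directly inside data_dir, and split folders contain only images.
    """
    data_dir = Path(data_dir)
    counts = {}
    # "test" is optional, so only report splits that actually exist.
    for split in ("train", "valid", "test"):
        split_dir = data_dir / split
        if split_dir.is_dir():
            counts[split] = sum(1 for p in split_dir.iterdir() if p.is_file())

    columns = []
    labels_path = data_dir / "labels.csv"
    if labels_path.is_file():
        with labels_path.open(newline="") as f:
            # First row of the CSV is taken to be the column names.
            columns = next(csv.reader(f), [])
    return counts, columns
```

Calling `summarize_data_folder("data")` returns a dictionary of file counts per split and the list of measurement columns, which makes it easy to spot an empty split or a missing labels file before simulating data.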

Data simulation

This is where the statistical association between the images and the 'clinical' data is simulated, based on a structural causal model and the measurements of the images.

  1. Define a structural causal model that will generate the data

    See experiments/sims/README.md for a short instruction on how to define a structural causal model.
    See experiments/sims for an example csv file that defines a structural causal model.

  2. Define a setting in the settings directory with a setting.json file that together with the structural causal model defines the experiment (see the example)

  3. After defining the SCM and setting, run simulate_data.py to create a dataset based on the SCM and sample images accordingly for the defined setting like so:

    python simulate_data.py --setting <mysetting>

    Run without the --setting argument to replicate the published results, using the default setting.

    This will create a data folder in the setting/mysetting folder. The images are stored here, coupled with the simulated ground truth data that will be used for training and validation.
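To make the idea of simulating 'clinical' data from image measurements concrete, here is a minimal sketch of a toy structural causal model. It is not the model used in the paper and does not reflect how simulate_data.py or the csv definition format work. The function name `simulate_scm`, the causal structure (size influences treatment x; x and variance influence outcome y) and all coefficients are assumptions chosen purely for illustration.

```python
import numpy as np


def simulate_scm(size, variance, seed=0):
    """Draw a binary treatment x and a continuous outcome y per image.

    Hypothetical SCM (illustration only):
        x ~ Bernoulli(sigmoid(z_size))
        y = 1.0 * x + 0.5 * z_variance + N(0, 1)
    where z_* are the standardized image measurements.
    """
    rng = np.random.default_rng(seed)
    size = np.asarray(size, dtype=float)
    variance = np.asarray(variance, dtype=float)

    # Standardize the measurements so the coefficients are comparable.
    z_size = (size - size.mean()) / size.std()
    z_var = (variance - variance.mean()) / variance.std()

    # Treatment depends on nodule size through a logistic link.
    p_x = 1.0 / (1.0 + np.exp(-z_size))
    x = rng.binomial(1, p_x)

    # Outcome depends on treatment and image variance, plus Gaussian noise.
    y = 1.0 * x + 0.5 * z_var + rng.normal(0.0, 1.0, size=size.shape)
    return x, y
```

Each image then carries a simulated (x, y) pair alongside its measurements, which is the kind of pairing the data folder described above is meant to hold. Fixing the seed keeps the simulated dataset reproducible between runs.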

Running the models

To replicate, run:

python train.py

To run on your own simulated data:

python train.py --setting <mysetting>

To evaluate the CNN's ability to predict the ground truth measurements, run with:

python train.py --setting <mysetting> --fase feature

Results are saved in the setting directory, with a subfolder for each 'fase' (xybn: predict x and y and use the bottleneck loss; feature: predict features).

experiments/base_model/params.json contains the hyperparameters that control how train.py runs.
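When experimenting with different hyperparameters, it can help to inspect or override the values in the JSON file programmatically. The helper below is a sketch under assumptions: `load_params` is a hypothetical name, the actual keys in params.json are not listed here, and train.py may read the file differently.

```python
import json
from pathlib import Path


def load_params(path, **overrides):
    """Read a JSON hyperparameter file and apply keyword overrides.

    The keys are whatever the JSON file contains; nothing is assumed
    about their names. Overrides replace or add top-level keys only.
    """
    with Path(path).open() as f:
        params = json.load(f)
    params.update(overrides)
    return params
```

For example, `load_params("experiments/base_model/params.json")` returns the stored values as a dictionary, and passing a keyword argument replaces a single value without editing the file.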

Evaluation

Run TensorBoard in this directory to visualize the results.
