Code Monkey home page Code Monkey logo

shyamalschandra / iccv-cdpath2021-id-8 Goto Github PK

View Code? Open in Web Editor NEW

This project forked from srinidhipy/iccv-cdpath2021-id-8

0.0 1.0 0.0 1.22 MB

Official code for "Improving Self-supervised Learning with Hardness-aware Dynamic Curriculum Learning: An Application to Digital Pathology", ICCV, 2021, CDpath Workshop (Oral).

Home Page: https://openaccess.thecvf.com/content/ICCV2021W/CDPath/html/Srinidhi_Improving_Self-Supervised_Learning_With_Hardness-Aware_Dynamic_Curriculum_Learning_An_Application_ICCVW_2021_paper.html

License: MIT License

Python 100.00%

iccv-cdpath2021-id-8's Introduction

Improving Self-Supervised Learning with Hardness-aware Dynamic Curriculum Learning: An Application to Digital Pathology

Overview

In this work, we attempt to improve self-supervised pretrained representations through the lens of curriculum learning by proposing a hardness-aware dynamic curriculum learning (HaDCL) approach. To improve the robustness and generalizability of SSL, we dynamically leverage progressive harder examples via easy-to-hard and hard-to-very-hard samples during mini-batch downstream fine-tuning. We discover that by progressive stage-wise curriculum learning, the pretrained representations are significantly enhanced and adaptable to both in-domain and out-of-domain distribution data.

We carry out extensive validation experiments on three histopathology benchmark datasets on both patch-wise and slide-level classification tasks:

Method

Results

  • Predicted tumor probability heat-maps on Camelyon16 (in-domain) and MSKCC (out-of-domain) test sets

Pre-requisites

Core implementation:

  • Python 3.7+
  • Pytorch 1.7+
  • Openslide-python 1.1+
  • Albumentations 1.8+
  • Scikit-image 0.15+
  • Scikit-learn 0.22+
  • Matplotlib 3.2+
  • Scipy, Numpy (any version)

Datasets

Training

The model training consists of three steps:

  1. Self-supervised pretraining (i.e., Our earlier proposed method Resolution sequence prediction (RSP) and Momentum Contrast (MoCo))
  2. Curriculum-I fine-tuning (easy-to-hard)
  3. Curriculum-II fine-tuning (hard-to-very-hard)

1. Self-supervised pretraining

In this work, we build our approach on our previous work "Self-Supervised driven Consistency Training for Annotation Efficient Histopathology Image Analysis. Please, refer to our previous repository for pretraining details on whole-slide-images. We have included the pretrained model (cam_SSL_pretrained_model.pt) for Camelyon16, found in the "models" folder.

2, 3. Fine-tuing of pretrained models on the target task using hardness-aware dynamic curriculum learning

  1. Download the self-supervised pretrained model from the models folder.
  2. Download the desired dataset; you can simply add any other dataset that you wish.
  3. For slide-level classification tasks, run the following command by the desired parameters. The arguments can be set in the corresponding files.
python ft_Cam_SSL.py  // Supervised fine-tuning on Camelyon16    
python ft_Cam_SSL_CL_I.py    // Curriculum-I fine-tuning on Camelyon16
python ft_Cam_SSL_CL_II.py    // Curriculum-II fine-tuning on Camelyon16

We have included the fine-tuned models (cam_curriculum_I_finetuned_model.pt, cam_curriculum_II_finetuned_model.pt) for Camelyon16, found in the "models" folder. These models can be used to test the predictions on Camelyon16 and MSKCC test sets.

  1. For patch-level classification tasks, run the following command by the desired parameters. The arguments can be set in the corresponding files.
python ft_MHSIT_SSL.py  // Supervised fine-tuning on MHIST    
python ft_MHSIT_SSL_CL_I.py    // Curriculum-I fine-tuning on MHIST
python ft_MHSIT_SSL_CL_II.py    // Curriculum-II fine-tuning on MHIST

Testing

The test performance is validated at two stages:

  1. Patch-level predictions: The patch-level predictions can be performed to generate tumor probability heat-maps on Camelon16 and MSKCC datasets. Note: MSKCC doesn't contain any training images and hence, we use this dataset as an external validation set to test our method's performance on out-of-distribution data.
prob_map_generation.py  // tumor probability heat-map on Camelyon16, MSKCC 
eval_MHIST.py  // patch-wise predictions on MHIST 
  1. Random-forest-based slide-level classifier: The final slide-level predictions can be performed using scripts inside the "Slide_Level_Analysis" folder.
extract_feature_heatmap.py  // to extract geometrical features from the heatmap predictions of the previous stage   
wsi_classification.py  // Random-forest based slide-level classifier

License

Our code is released under MIT license.

Citation

If you find our work useful in your research or if you use parts of this code please consider citing our papers:

@article{srinidhi2021improving,
  title={Improving Self-supervised Learning with Hardness-aware Dynamic Curriculum Learning: An Application to Digital Pathology},
  author={Srinidhi, Chetan L and Martel, Anne L},
  journal={arXiv preprint arXiv:2108.07183},
  year={2021}
}
@article{srinidhi2021self,
  title={Self-supervised driven consistency training for annotation efficient histopathology image analysis},
  author={Srinidhi, Chetan L and Kim, Seung Wook and Chen, Fu-Der and Martel, Anne L},
  journal={arXiv preprint arXiv:2102.03897},
  year={2021}
}

Acknowledgements

This work was funded by Canadian Cancer Society and Canadian Institutes of Health Research (CIHR). It was also enabled in part by support provided by Compute Canada (www.computecanada.ca).

Questions or Comments

Please direct any questions or comments to me; I am happy to help in any way I can. You can email me directly at [email protected].

iccv-cdpath2021-id-8's People

Contributors

srinidhipy avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.