pathology_adversarial's Introduction

Pathology_Adversarial

Overview

This repository contains the Python version of a general workflow for adversarial attacks and attack-proof artificial intelligence models in computational pathology. It is based on workflows previously described in Kather et al., Nature Medicine 2019 and Ghaffari Laleh et al. 2021. The objective is to predict a given label directly from digitized histological whole slide images (WSI) under multiple adversarial attacks. The label is defined at the level of patients, not at the level of pixels in a given WSI. The AI models can be trained normally or robustly against adversarial attacks by setting the advTrain flag in the experiment file. The strength of the attack during training is set by the value of epsilon in the experiment file.
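To make the role of epsilon concrete, here is a minimal sketch of a one-step gradient-sign attack bounded in the L-infinity norm. It uses a toy logistic classifier in plain Python rather than the networks in this repository; the function name fgsm_linf, the weights and the input values are all hypothetical, and the attacks actually shipped with the repository may work differently.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sign(v):
    return (v > 0) - (v < 0)

def fgsm_linf(x, y, w, b, epsilon):
    """One-step L-infinity attack on a logistic classifier p = sigmoid(w.x + b).

    x: pixel intensities in [0, 1]; y: true label (0 or 1).
    """
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    # Gradient of the binary cross-entropy loss with respect to the input x.
    grad = [(p - y) * wi for wi in w]
    # Move each pixel by epsilon in the direction that increases the loss,
    # then clip back to the valid intensity range.
    return [min(1.0, max(0.0, xi + epsilon * sign(g))) for xi, g in zip(x, grad)]

# Toy input and weights (illustrative values only).
x = [0.2, 0.7, 0.5]
w = [1.5, -2.0, 0.5]
x_adv = fgsm_linf(x, y=1, w=w, b=0.1, epsilon=0.005)
print("largest per-pixel change:", max(abs(a - c) for a, c in zip(x_adv, x)))
```

No pixel moves by more than epsilon, which is why a larger epsilon corresponds to a stronger attack.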

Note that this repository follows exactly the same procedure as https://github.com/KatherLab/HIA. It is therefore recommended to read that repository first in order to use the pathology_adversarial scripts.

Example data

The example dataset, pre-processed with the modifications explained in this study, can be found at https://zenodo.org/record/5337009. These tiles are derived from the TCGA-BRCA breast cancer histology dataset at https://portal.gdc.cancer.gov/ (please check this website for the original data license).

System requirements

Hardware requirements

A CUDA-enabled NVIDIA GPU is needed for fast and convenient training. The code automatically detects whether a GPU is present in the system and runs accordingly.
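GPU detection of this kind is commonly done with torch.cuda.is_available(). The sketch below shows the general pattern; the helper name pick_device is hypothetical and this is not necessarily how the repository's scripts perform the check.

```python
def pick_device():
    """Return "cuda" when PyTorch sees a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment; report CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"

device = pick_device()
print(f"Running on: {device}")
```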

Software requirements

OS Dependencies

The scripts in this repository have been developed on Windows Server 2019 (version 1809).

Python Dependencies (Python 3.10.6)

This repository mainly depends on the following packages:

PyTorch (1.11.0+cu113)
scikit-learn (1.1.2)
NumPy (1.23.1)
Pandas (1.4.3)
OpenCV (4.6.0)
pytorch_pretrained_vit (0.0.7)
pickle (0.7.5)
efficientnet_pytorch (0.7.1)
torchvision (0.12.0+cu113)
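One way to compare an environment against this list is to query the installed versions with the standard library's importlib.metadata. This is only a sketch: the distribution names below are the ones these packages are usually published under on PyPI, which is an assumption and not something the repository states. pickle is left out because it ships with the Python standard library.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as usually published on PyPI (assumed, not taken
# from the repository), mapped to the versions listed in this README.
EXPECTED = {
    "torch": "1.11.0+cu113",
    "torchvision": "0.12.0+cu113",
    "scikit-learn": "1.1.2",
    "numpy": "1.23.1",
    "pandas": "1.4.3",
    "opencv-python": "4.6.0",
    "pytorch_pretrained_vit": "0.0.7",
    "efficientnet_pytorch": "0.7.1",
}

def check_versions(expected):
    """Map each package name to (installed version or None, listed version)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        report[name] = (installed, wanted)
    return report

for name, (installed, wanted) in check_versions(EXPECTED).items():
    status = "missing" if installed is None else installed
    print(f"{name}: installed={status}, README lists {wanted}")
```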

Installation guide

To use this repository, it is recommended to first read https://github.com/KatherLab/HIA. The structure of the experiment file is the same as explained in the HIA repository. Please install Python (e.g. via Anaconda) on your Windows system and run the experiment file, filled with all the required information, through the Main.py script. No further installation is required. Training and deployment time vary with the dataset size and the computational power of the system.

Demo and Instructions for use

To start training, download the code in this repository and fill in the experiment file with the paths to your training set and, if applicable, your validation set. You must define the target label, which has to be present in the clinical table. The number of epochs, batch size and learning rate can be defined in the experiment file. Set advTrain to True to train an adversarially robust model, or to False to train the model normally. For all other details of the experiment file, we recommend checking https://github.com/KatherLab/HIA. Here is an example experiment file:

{
    "projectDetails":"This is the demo for adversarial, 3-fold cross-validation training!",
    "dataDir_train":["D:\\Path to the folder containing the training data set."], # This folder contains a subfolder for each WSI with the extracted tiles.
    "dataDir_test":["E:\\Path to the folder containing the test data set."], # This is only required if you run the deployment script.

    "targetLabels":["RCC subtyping"], # This is the name of column in the clinical table which we want to use as a prediction label.
    "trainFull":"False", # Set to True if you want to use all the data to train a model and then use this trained model for deployment.
    "numPatientToUse" : "ALL", # You can use a subset of the patients for training. When set to "ALL", all patients present in the clinical table are used.

    "maxNumBlocks":100, # Maximum number of tiles to select for each WSI. 
    "minNumBlocks" : 8, # Minimum number of tiles that must be present in a WSI.

    "epochs":50, # Maximum number of epochs to train the model. If earlyStop is set to True, training can stop before reaching this number.
    "batchSize":128,
    "freezeRatio" : 0.5,
    "repeatExperiment" : 5, # The experiment will be repeated 5 times with different random seeds.
     
    "modelName":"vit", 
    "opt":"adam",
    "lr":0.0001,
    "reg":0.00001,
    "gpuNo":1,

    "earlyStop":"True",
    "minEpochToTrain":10, 
    "patience":5,

    "advTrain" : "True", # If set to True, the model will be trained robustly.
    "epsilon" : 0.005,
    "attackName" : "AdvDrop",
    "alpha" : 0.0025,
    "maxNoIteration" : 10,
    "perturbationType" : "linf"
}
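The example above carries # annotations and stores booleans as the strings "True" and "False", neither of which plain JSON handles on its own. Whether the repository's own loader accepts the annotations is not stated here; they may be for documentation only. The sketch below, with the hypothetical helpers strip_hash_comments and load_experiment, shows one way to read such a file.

```python
import json

def strip_hash_comments(text):
    """Remove '#' comments that sit outside of JSON string literals."""
    out = []
    for line in text.splitlines():
        in_string = False
        escaped = False
        kept = []
        for ch in line:
            if in_string:
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == '"':
                    in_string = False
            elif ch == '"':
                in_string = True
            elif ch == "#":
                break  # the rest of the line is a comment
            kept.append(ch)
        out.append("".join(kept))
    return "\n".join(out)

def load_experiment(text):
    """Parse an annotated experiment file and convert "True"/"False" to bool."""
    cfg = json.loads(strip_hash_comments(text))
    for key, value in cfg.items():
        if value in ("True", "False"):
            cfg[key] = (value == "True")
    return cfg

# A shortened, illustrative experiment file (not the full example).
demo = '''
{
    "projectDetails" : "demo run #1",
    "advTrain" : "True", # train robustly
    "epsilon" : 0.005
}
'''
cfg = load_experiment(demo)
print(cfg)
```

A '#' inside a quoted value, as in "demo run #1", is kept because the scanner tracks whether it is inside a string.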

Expected output

This repository mainly reports area under the curve (AUC) values as output during deployment. If you use the k-fold cross-validation option, it prints the AUC values for the test set after each fold and, at the end, concatenates the results of all folds. For a single deployment, it prints the AUC values at the end of deployment. It also saves all results in a report.txt file, together with the confidence intervals for the calculated AUC values.
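For readers unfamiliar with the metric, the following sketch computes an AUC and a confidence interval in plain Python. The percentile bootstrap used here is an assumption for illustration; this README does not state how the repository derives its confidence intervals, and the function names are hypothetical.

```python
import random

def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    values = []
    while len(values) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        sample_labels = [labels[i] for i in idx]
        if len(set(sample_labels)) < 2:
            continue  # resample when one class is missing
        values.append(auc(sample_labels, [scores[i] for i in idx]))
    values.sort()
    lo = values[int((1 - level) / 2 * n_boot)]
    hi = values[min(n_boot - 1, int((1 + level) / 2 * n_boot))]
    return lo, hi

# Toy patient-level labels and prediction scores (illustrative values only).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print("AUC:", auc(labels, scores))
print("95% CI:", bootstrap_ci(labels, scores))
```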

License

This project is covered under the MIT license.

pathology_adversarial's People

Contributors

jnkather, narminghaffari


pathology_adversarial's Issues

attack module

How can I install the attack module in training.py? I got this error: ModuleNotFoundError: No module named 'attack'
There is no attack module available through pip install.
I have tried py-attack but got this error: ImportError: cannot import name 'FastGradientSignUntargeted' from 'py_attack'.
