LUPI

Learning using Privileged Information for Chemosensitivity Prediction in Ovarian Cancer Patients.

Figure: CNN Architecture

Whole slide images (WSIs) and gene expression data are taken from the GDC Data Portal. Whole slide images can be downloaded from here. The ID of each WSI used in this project is available in the ‘data_files/imgs.csv’ file. To train and validate the models, one needs the tiles extracted from the WSIs, the gene expression profiles and the respective labels for each patient.

1-pre-processing: WSIs are very large and it is difficult to process an entire image, so we extracted the top 3 tiles from each slide. Preprocessing code can be downloaded from here. This tutorial contains three files: ‘slide.py’, ‘filter.py’ and ‘tiles.py’.

Before executing ‘slide.py’, add the path to the WSI directory and the path to the directory where you want to save the tiles on lines 32, 143 and 751. This file creates a low-resolution image of each whole slide image. No changes are required in ‘filter.py’. In ‘tiles.py’, change the tile size to 1536x2048 and the number of top tiles to 3 on lines 38, 39 and 40 respectively. After these modifications, the code can be executed to generate the top 3 tiles from each WSI. Run the files in this order: slide.py -> filter.py -> tiles.py
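The execution order above can be sketched as a small runner script. This is only an illustration: it assumes the three files can be run as standalone scripts from one directory, and the names `build_commands` and `run_pipeline` are hypothetical helpers, not part of the preprocessing code.

```python
import subprocess
import sys
from pathlib import Path

# Order taken from the text above: slide.py -> filter.py -> tiles.py
PIPELINE = ["slide.py", "filter.py", "tiles.py"]


def build_commands(script_dir, python=sys.executable):
    """Return one command per script, in pipeline order."""
    script_dir = Path(script_dir)
    return [[python, str(script_dir / name)] for name in PIPELINE]


def run_pipeline(script_dir):
    """Run each script in order and stop at the first failure."""
    for cmd in build_commands(script_dir):
        print("running:", " ".join(cmd))
        # check=True raises CalledProcessError on a non-zero exit code,
        # so filter.py never runs on incomplete output from slide.py.
        subprocess.run(cmd, check=True)


# Hypothetical usage, after the paths inside the scripts have been edited:
# run_pipeline("/path/to/preprocessing")
```

Stopping at the first failure matters here because each step presumably consumes the output of the previous one.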

2-Stain normalization: Tiles extracted from WSIs need to be stain normalized to remove color variations. To do that, update the local paths in ‘normalization.py’ and execute it.

After the above two steps, one will have the normalized top three tiles for each patient. Add the paths of these tiles to the ‘imgs.csv’ file.

3-data_files: This folder contains the following three files:

I. imgs.csv

II. genes.csv

III. labels.csv

imgs.csv: This file contains the paths to the tiles for each patient used in this project. These paths need to be updated to match your local paths.

genes.csv: In this file each column represents one patient’s gene profile, which is already preprocessed as discussed in the paper.

labels.csv: This file contains the labels (-1, +1) for each patient. -1 represents chemo-resistant patients and +1 represents chemo-sensitive patients.

NOTE: The data in the genes.csv and labels.csv files are in the same order as the patient IDs in the imgs.csv file. This ordering is used later for training. Also change the local path to the tiles directory in the imgs.csv file as needed.
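Because the three files are matched only by position, a quick sanity check before training can catch a missing or extra entry. The sketch below is an assumption-laden illustration, not the project's 'loader.py': it assumes genes.csv has no header row or gene-name column, that labels.csv has one label per row, and the function names are hypothetical. The values in the usage lines are toy data.

```python
import csv
import io


def read_gene_profiles(fileobj):
    """Return one list of expression values per patient (one per CSV column)."""
    rows = [[float(v) for v in row] for row in csv.reader(fileobj) if row]
    # zip(*rows) transposes genes-by-patients into patients-by-genes.
    return [list(col) for col in zip(*rows)]


def read_labels(fileobj):
    """Return one integer label per non-empty row."""
    return [int(float(row[0])) for row in csv.reader(fileobj) if row]


def check_alignment(profiles, labels):
    """Check label values and that there is exactly one label per patient."""
    bad = sorted(set(labels) - {-1, 1})
    if bad:
        raise ValueError(f"unexpected label values: {bad}")
    if len(profiles) != len(labels):
        raise ValueError(f"{len(profiles)} gene profiles but {len(labels)} labels")
    return {
        "chemo-resistant": labels.count(-1),
        "chemo-sensitive": labels.count(1),
    }


# Toy data: 2 genes (rows) for 3 patients (columns), and 3 labels.
profiles = read_gene_profiles(io.StringIO("0.1,0.2,0.3\n1.0,2.0,3.0\n"))
labels = read_labels(io.StringIO("1\n-1\n-1\n"))
summary = check_alignment(profiles, labels)
```

A count check like this cannot detect rows that are present but in the wrong order, so the ordering described in the note still has to be preserved by hand.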

By now you will have all the data required to train and validate the models. Each model uses the ‘imgs.csv’, ‘genes.csv’ and ‘labels.csv’ files together with 'loader.py' to load the data, and 'patch_extractor.py' to extract patches from the tiles, so make sure you update the local paths in the ‘imgs.csv’ file.
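To illustrate what extracting patches from a tile involves, here is a minimal sketch that computes patch coordinates on a regular grid. It is not the logic of 'patch_extractor.py', which may sample patches differently. The function name `patch_boxes`, the 512x512 patch size and the reading of 1536x2048 as height x width are all assumptions made for the example.

```python
def patch_boxes(tile_w, tile_h, patch_w, patch_h, stride_w=None, stride_h=None):
    """Return (left, upper, right, lower) boxes for patches that fit in the tile.

    With the default strides the patches do not overlap. Any border strip
    too small to hold a whole patch is skipped.
    """
    stride_w = stride_w or patch_w
    stride_h = stride_h or patch_h
    boxes = []
    for top in range(0, tile_h - patch_h + 1, stride_h):
        for left in range(0, tile_w - patch_w + 1, stride_w):
            boxes.append((left, top, left + patch_w, top + patch_h))
    return boxes


# Assumed tile of width 2048 and height 1536, hypothetical 512x512 patches.
boxes = patch_boxes(2048, 1536, 512, 512)
```

The boxes use the (left, upper, right, lower) order that Pillow's `Image.crop` expects, so each one can be passed to it directly if the tiles are opened with Pillow.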

To train a model, simply execute the model file, e.g. privileged_model.py, which loads the gene profiles of each patient from 'genes.csv' and the labels from 'labels.csv' to train and validate the privileged-space model. One can also use the ‘testing.py’ file to run the already trained input-space and LUPI models.
