The corona_ct_classification's intro from boshen0

INFORMS 2020 QSR Data Challenge on "CT Scan Diagnosis for COVID-19"

We were selected as one of the finalists for this challenge and won the runner-up award!

This repository contains PyTorch code for image classification on the COVID dataset of the INFORMS 2020 QSR Data Challenge. We use an ensemble model consisting of DenseNet-121 and a Residual Attention model. We first hold out 15% of the data as a validation set, which is not used in the training process, and we select the model with the highest validation accuracy. DenseNet-121 is pretrained on ImageNet, and the Residual Attention model is pretrained on CIFAR-10. We train these two pretrained models separately, each in an end-to-end manner. We then extract features from the second-to-last layer of each model, concatenate them, and fit another classifier on the concatenated features using the whole training dataset. For this final classifier we use an SVM with random Gaussian kernels.
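The 15% hold-out and the "highest validation accuracy" selection rule can be sketched as follows. This is a minimal illustration using only the standard library, not the repository's actual code: the function names `holdout_split` and `select_best`, the seed value, and the checkpoint names and accuracies in the usage lines are all hypothetical.

```python
import random


def holdout_split(items, val_fraction=0.15, seed=0):
    """Shuffle items with a fixed seed and hold out a fraction for validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * val_fraction))
    # The first n_val items become the validation set; the rest are for training.
    return shuffled[n_val:], shuffled[:n_val]


def select_best(val_accuracy_by_checkpoint):
    """Return the checkpoint name with the highest validation accuracy."""
    return max(val_accuracy_by_checkpoint, key=val_accuracy_by_checkpoint.get)


# The challenge training set has 251 COVID + 292 non-COVID = 543 images.
image_ids = list(range(543))
train_ids, val_ids = holdout_split(image_ids)
print(len(train_ids), len(val_ids))

# Hypothetical validation accuracies, for illustration only.
print(select_best({"epoch_1": 0.90, "epoch_2": 0.95, "epoch_3": 0.93}))
```

The validation images are never seen during training, so the accuracy used for model selection is not inflated by memorization of the training set.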

Dependencies

Python3, Scikit-learn, torch (Please refer to requirement.txt)

Dataset

The data for this Data Challenge is selected from an open-source data set on COVID-19 CT images. The raw data have been divided into two subsets: training and test sets. The training dataset is provided to participants to develop their models. The training dataset consists of 251 COVID-19 and 292 non-COVID-19 CT images. In addition to the images, meta-information (e.g., patient information, severity, image caption) is provided in a spreadsheet. The details of the original dataset can be found in Zhao et al. (2020).

Curated Dataset

We extended this work by building a large lung CT scan dataset for COVID-19, curated from 7 public datasets. The dataset and its description are available at the following links:

https://www.kaggle.com/maedemaftouni/large-covid19-ct-slice-dataset

https://github.com/maftouni/Curated_Covid_CT.git

How to run

The training data is saved in data/training. To use your own data, replace everything in data/training. It contains two folders: one for COVID images and another for Non-COVID images. The test data should be put in data/test.
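Before training on your own data, it can help to check that the folder layout matches what is described above. The sketch below is an illustration, not part of the repository: the helper name `list_images_by_class` and the set of image extensions are assumptions, and the class labels are simply taken from whatever the two subfolders are named.

```python
from pathlib import Path

# Assumed image extensions; adjust to match your own files.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def list_images_by_class(root):
    """Map each class subfolder name under root to its sorted image paths."""
    root = Path(root)
    classes = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        classes[class_dir.name] = sorted(
            f for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return classes


if __name__ == "__main__":
    training_root = Path("data/training")
    if training_root.is_dir():
        # Print the number of images found in each class folder.
        for name, files in list_images_by_class(training_root).items():
            print(name, len(files))
```

With the challenge data in place, the two folders should report 251 and 292 images.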

Results might differ slightly across devices, since the same random seed can lead to different performance on different hardware.
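To reduce (but not eliminate) run-to-run variation on a single device, the random number generators can be seeded before training. The helper below is a hedged sketch and is not taken from the repository's scripts: the name `set_seed` and the seed value are assumptions, and NumPy and PyTorch are seeded only if they are installed.

```python
import random


def set_seed(seed):
    """Seed the Python, NumPy, and PyTorch random number generators."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed.
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        # Prefer deterministic cuDNN kernels over autotuned ones.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch not installed; nothing to seed.


set_seed(0)  # hypothetical seed value
```

Even with fixed seeds, different GPUs, drivers, or library versions can produce slightly different floating-point results.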

data_prep.py

To train the DenseNet121 model:

python Model_densenet121.py

To train the residual_attention model:

python Model_residual_attention.py

To train the ensemble model:

python Model_Ensemble.py

Network Structure

Sample outputs

Sample classification results

Attention can be viewed, broadly, as a tool for focusing on the most informative parts of the image:

Evaluation

Here we evaluate the performance of our best model on the training data.

Confusion Matrix

                  predict Covid       predict Non-Covid
Covid                 247                      4
Non-Covid              2                      290

Accuracy

Accuracy: 98.9%
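The reported accuracy follows directly from the confusion matrix above. The short check below recomputes it. Sensitivity and specificity are also derived from the same four counts; they are not reported in the original and are shown only as arithmetic on the table.

```python
# Counts from the confusion matrix (rows: true class, columns: predicted class).
tp, fn = 247, 4    # true Covid: predicted Covid / predicted Non-Covid
fp, tn = 2, 290    # true Non-Covid: predicted Covid / predicted Non-Covid

total = tp + fn + fp + tn
accuracy = (tp + tn) / total      # fraction of all images classified correctly
sensitivity = tp / (tp + fn)      # fraction of Covid images detected
specificity = tn / (tn + fp)      # fraction of Non-Covid images detected

print(f"total images: {total}")
print(f"accuracy:     {accuracy:.1%}")
print(f"sensitivity:  {sensitivity:.1%}")
print(f"specificity:  {specificity:.1%}")
```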
