Note: The COVID-Net models provided here are intended to be used as reference models that can be built upon and enhanced as new data becomes available. They are currently at a research stage, are not production-ready, and are not meant for direct clinical diagnosis; we are working continuously to improve them. Please do not use COVID-Net for self-diagnosis, and seek help from your local health authorities.
Update 11/28/2021: We released a new training dataset with over 30,000 CXR images from a multinational cohort of over 16,400 patients. The dataset contains 16,490 positive COVID-19 images from over 2,800 patients. The COVIDx V9A dataset is for detection of no pneumonia/non-COVID-19 pneumonia/COVID-19 pneumonia, and the COVIDx V9B dataset is for COVID-19 positive/negative detection.

Update 10/19/2021: We released a new COVID-Net CXR-3 model for COVID-19 positive/negative detection, trained and tested on the COVIDx8B dataset and leveraging the new MEDUSA (Multi-scale Encoder-Decoder Self-Attention) architecture.

Update 04/21/2021: We released a new COVID-Net CXR-S model and the COVIDxSev dataset for airspace severity grading in CXR images of COVID-19 positive patients. For more information on training, testing and inference, please refer to the severity docs.

Update 03/20/2021: We released a new COVID-Net CXR-2 model for COVID-19 positive/negative detection, trained on the new COVIDx8B dataset of 16,352 CXR images from a multinational cohort of 15,346 patients from at least 51 countries. The test results are based on the new COVIDx8B test set of 200 COVID-19 positive and 200 negative CXR images.

Update 03/19/2021: We released updated datasets and dataset curation scripts. The COVIDx V8A dataset and create_COVIDx.ipynb are for detection of no pneumonia/non-COVID-19 pneumonia/COVID-19 pneumonia, and the COVIDx V8B dataset and create_COVIDx_binary.ipynb are for COVID-19 positive/negative detection. Both datasets contain over 16,000 CXR images with over 2,300 positive COVID-19 images.

Update 01/28/2021: We released updated datasets and dataset curation scripts. The COVIDx V7A dataset and create_COVIDx.ipynb are for detection of no pneumonia/non-COVID-19 pneumonia/COVID-19 pneumonia, and the COVIDx V7B dataset and create_COVIDx_binary.ipynb are for COVID-19 positive/negative detection. Both datasets contain over 15,600 CXR images with over 1,700 positive COVID-19 images.
Update 01/05/2021: We released a new COVIDx6 dataset for binary classification (COVID-19 positive or COVID-19 negative) with over 14,500 CXR images and 617 positive COVID-19 images.

Update 11/24/2020: We released CancerNet-SCa for skin cancer detection, part of the CancerNet initiatives.

Update 11/15/2020: We released COVIDNet-P inference and evaluation scripts for identifying pneumonia in CXR images using the COVIDx5 dataset. For more information, please refer to this doc.

Update 10/30/2020: We released a new COVIDx5 dataset with over 14,200 CXR images and 617 positive COVID-19 images.

Update 09/11/2020: We released updated COVIDNet-S models for geographic and opacity extent scoring of SARS-CoV-2 lung severity, and updated the inference script for opacity extent scoring ranging from 0 to 8.

Update 07/08/2020: We released COVIDNet-CT, which was trained and tested on 104,009 CT images from 1,489 patients. For more information, as well as instructions to run and download the models, refer to this repo.

Update 06/26/2020: We released 3 new models, COVIDNet-CXR4-A, COVIDNet-CXR4-B and COVIDNet-CXR4-C, trained on the new COVIDx4 dataset with over 14,000 CXR images and 473 positive COVID-19 images for training. The test results are based on the same test dataset as the COVIDNet-CXR3 models.

Update 06/01/2020: We released an inference script and the models for geographic and opacity extent scoring of SARS-CoV-2 lung severity.

Update 05/26/2020: For a detailed description of the methodology behind COVID-Net based deep neural networks for geographic extent and opacity extent scoring of chest X-rays for SARS-CoV-2 lung disease severity, see the paper here.

Update 05/13/2020: We released 3 new models, COVIDNet-CXR3-A, COVIDNet-CXR3-B and COVIDNet-CXR3-C, trained on a new COVIDx dataset with both PA and AP X-rays. The results are now based on a test set containing 100 COVID-19 samples.

Update 04/16/2020: If you have questions, please check the new FAQ page first.
COVID-Net CXR-2 for COVID-19 positive/negative detection architecture and example chest radiography images of COVID-19 cases from 2 different patients and their associated critical factors (highlighted in red) as identified by GSInquire.
The COVID-19 pandemic continues to have a devastating effect on the health and well-being of the global population. A critical step in the fight against COVID-19 is effective screening of infected patients, with one of the key screening approaches being radiology examination using chest radiography. Early studies found that patients infected with COVID-19 present characteristic abnormalities in chest radiography images. Motivated by this and inspired by the open source efforts of the research community, in this study we introduce COVID-Net, a deep convolutional neural network design tailored for the detection of COVID-19 cases from chest X-ray (CXR) images that is open source and available to the general public. To the best of the authors' knowledge, COVID-Net is one of the first open source network designs for COVID-19 detection from CXR images at the time of initial release. We also introduce COVIDx, an open access benchmark dataset that we generated comprising 13,975 CXR images across 13,870 patient cases, with the largest number of publicly available COVID-19 positive cases to the best of the authors' knowledge. Furthermore, we investigate how COVID-Net makes predictions using an explainability method in an attempt not only to gain deeper insights into critical factors associated with COVID cases, which can aid clinicians in improved screening, but also to audit COVID-Net in a responsible and transparent manner to validate that it is making decisions based on relevant information from the CXR images. By no means a production-ready solution, the hope is that the open access COVID-Net, along with the description on constructing the open source COVIDx dataset, will be leveraged and built upon by both researchers and citizen data scientists alike to accelerate the development of highly accurate yet practical deep learning solutions for detecting COVID-19 cases and to speed treatment of those who need it the most.
For a detailed description of the methodology behind COVID-Net and a full description of the COVIDx dataset, please click here.
For a detailed description of the methodology behind COVIDNet CXR-S severity assessment, please click here.
For more information on COVIDNet-S deep neural networks for geographic extent and opacity extent scoring of chest X-rays for SARS-CoV-2 lung disease severity, please click here.
For a detailed description of the methodology behind COVIDNet-CT and the associated dataset of 104,009 CT images from 1,489 patients, please click here.
Currently, the COVID-Net team is working on COVID-RiskNet, a deep neural network tailored for COVID-19 risk stratification. It is available as a work in progress via the included train_risknet.py script; please help contribute data so we can improve this tool.
If you would like to contribute COVID-19 x-ray images, please submit them to https://figure1.typeform.com/to/lLrHwv. Let's all work together to stop the spread of COVID-19!
If you are a researcher or healthcare worker and you would like access to the GSInquire tool to use to interpret COVID-Net results on your data or existing data, please reach out to [email protected] or [email protected]
Our desire is to encourage broad adoption and contribution to this project. Accordingly, this project has been licensed under the GNU Affero General Public License 3.0. Please see the license file for terms. If you would like to discuss alternative licensing models, please reach out to us at [email protected] and [email protected] or [email protected]
If there are any technical questions after the README, FAQ, and past/current issues have been read, please post an issue or contact:
If you find our work useful, you can cite our paper using:
@Article{Wang2020,
author={Wang, Linda and Lin, Zhong Qiu and Wong, Alexander},
title={COVID-Net: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images},
journal={Scientific Reports},
year={2020},
month={Nov},
day={11},
volume={10},
number={1},
pages={19549},
issn={2045-2322},
doi={10.1038/s41598-020-76550-z},
url={https://doi.org/10.1038/s41598-020-76550-z}
}
Hi,
I am frequently getting the error message below:
"At least two variables have the same name: conv1_conv/bias"
when trying to test a pneumonia image. I also saw this error with normal and COVID-19 images, but less frequently.
The versions of TensorFlow I have:
tensorboard = 1.14.0
tensorflow = 1.14.0
There are many things one may learn from your work here with COVID-Net, as it is robust work that will reasonably help push/democratize AI in a positive direction.
One thing of note is that there are expected regimes within which X-ray/CT-based techniques, whether performed by a human or by AI, are expected to be viable. Unless I am mistaken, I did not see this addressed in the COVID-Net paper or in the repository's README file.
I think a section similar to the one in the repository seen in issue 55 under the title "Preliminary Conclusion", concerning expected constraints on testing/diagnosis, should be considered for COVID-Net.
I can't seem to reconcile the layer dimensions in the PDF. The first layer gives the dimensions of the input images in parentheses, so I assume the numbers in parentheses are the dimensions of what is passed to the next layer. If so, how does a 7x7 convolutional layer output 112x112x64 from an input of 224x224x3? Assuming step size and padding size are integers, this doesn't seem to work with the formula for calculating the output size of convolutions unless you have unreasonably huge padding.
For the conv1x1 layers the dimension gets cut in half, suggesting the step size is 2. However with a 1x1 filter this means you're dropping half of the pixels. Is this correct?
Furthermore, the first flatten layer is said to have a flattened dimension of 100352, but that's what you'd get from just PEPX 4.3. However, you also have PEPX 4.2, PEPX 4.1, and the last conv1x1 on the right all feeding into the flattened layer, each with 100352 elements. So are these 4x100352 all flattened together, feeding a vector of 401408 elements into the first FC layer (as I would expect, since they all come from the same input image), or are you treating them separately?
Could you please specify the PEPX layer dimensions?
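For reference, the standard formula out = floor((in + 2*padding - kernel) / stride) + 1 does produce 112 from 224 for a 7x7 kernel with stride 2 and padding 3 (the values used by most 7x7 stem convolutions; they are assumptions here, not confirmed repo settings), and a 1x1 convolution with stride 2 does indeed halve the grid by sampling every other pixel. A quick sketch for checking such cases:

```python
def conv2d_out(size, kernel, stride, padding):
    """Spatial output size of a convolution along one axis:
    floor((size + 2*padding - kernel) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

# 7x7 stem conv, stride 2, padding 3: 224 -> 112
print(conv2d_out(224, 7, 2, 3))  # 112
# 1x1 conv, stride 2, no padding: 112 -> 56 (keeps every other pixel)
print(conv2d_out(112, 1, 2, 0))  # 56
```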
Hi, while trying to load the model for inference or evaluation in Jupyter, I always get this error:
DataLossError: Checksum does not match: stored 1497157360 vs. calculated on the restored bytes 2410561084
[[node save/RestoreV2 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
My full notebook is shown below
import numpy as np
import os, argparse
import cv2
import tensorflow as tf
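For what it's worth, a DataLossError checksum mismatch on RestoreV2 usually indicates the checkpoint files were corrupted or truncated during download; re-downloading and hashing the files is a quick sanity check. A minimal, hypothetical helper (the path in the comment is an example, not a repo path):

```python
import hashlib

def file_md5(path, chunk_size=1 << 20):
    """MD5-hash a file in chunks so large checkpoint shards
    don't need to fit in memory at once."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()

# Compare against a hash computed on a machine where restore works, e.g.
# file_md5("models/model.data-00000-of-00001")
```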
Given that the data sources differ for the two classes (COVID vs. Pneumonia/Normal), how do you validate that the model doesn't classify the data source, but actually classifies the presence of COVID-19?
Thanks for working on this project! This is very interesting and very impactful.
COVID-Net relies on a design pattern of projection-expansion-projection-extension (PEPX) throughout the network. I have beginner-level knowledge of computer vision, and I haven't seen this design pattern before.
Without loading in the model, what are the output dimensions of each layer in the PEPX module (Figure 2, top right box) for PEPX1.1? This would give me a better understanding of how dimension is changing within the module.
What is the intuition around the effectiveness of this design pattern? Are there some previous papers that use this design pattern for their core results?
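From the paper's description, a PEPX module chains five stages: a 1x1 first-stage projection to fewer channels, a 1x1 expansion, a 3x3 depth-wise convolution, a 1x1 second-stage projection, and a 1x1 extension to the final channel count. A dependency-free shape trace of that pattern (the channel ratios below are illustrative assumptions, not the released model's exact values):

```python
def pepx_shape_trace(h, w, c_in, c_out, proj_ratio=0.5, expand_ratio=1.5):
    """Trace (height, width, channels) through the five PEPX stages.
    Spatial size is preserved throughout; only channel depth changes."""
    c_proj = int(c_in * proj_ratio)     # first-stage projection shrinks channels
    c_exp = int(c_proj * expand_ratio)  # expansion grows them again
    return [
        ("projection 1x1", (h, w, c_proj)),
        ("expansion  1x1", (h, w, c_exp)),
        ("depth-wise 3x3", (h, w, c_exp)),   # depth-wise conv keeps channel count
        ("projection 1x1", (h, w, c_proj)),
        ("extension  1x1", (h, w, c_out)),
    ]

for name, shape in pepx_shape_trace(56, 56, 256, 256):
    print(name, shape)
```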
Thanks for your work to help the people in need! Your site has been added! I currently maintain the Open-Source-COVID-19 page, which collects all open source projects related to COVID-19, including maps, data, news, APIs, analysis, medical and supply information, etc. Please share it with anyone who might need the information in the list, or who may contribute to some of those projects. You are also welcome to recommend more projects.
As mentioned in the paper, COVID-Net was pretrained on ImageNet. Could you please upload the ImageNet-pretrained weights so we can reproduce your results?
With reference to this line on main repository: "Motivated by this, a number of artificial intelligence (AI) systems based on deep learning have been proposed and results have been shown to be quite promising in terms of accuracy in detecting patients infected with COVID-19 using chest radiography images. However, to the best of the authors' knowledge, these developed AI systems have been closed source and unavailable to the research community for deeper understanding and extension, and unavailable for public access and use."
Emails were sent to one of the authors, namely Alexander, but the main COVID-Net GitHub repository has yet to reflect/acknowledge the much earlier repository, which would be easy and quick to do. Why hasn't this been done?
Example chest radiography images of COVID-19 cases from 2 different patients and their associated critical factors (highlighted in red) as identified by GSInquire.
Can you kindly tell what is GSInquire and how it was instrumental in identifying associated critical factors?
I've managed to load the provided model, but I'm not sure how to proceed from there to actually use it on an image.
In #2, @Vikramank mentioned a Flask app, so I assume it's possible to use the pre-trained models, but without more info or docs, I don't know how to do it.
EDIT: actually, digging a bit more into this, it seems that, without the actual Keras model, using the Tensorflow checkpoint is pretty hard :-? From keras-team/keras#5273 (comment):
Fundamentally, you cannot "turn an arbitrary TensorFlow checkpoint into a Keras model".
What you can do, however, is build an equivalent Keras model then load into this Keras model the weights contained in a TensorFlow checkpoint that corresponds to the saved model. In fact this is how the pre-trained InceptionV3 in Keras was obtained.
So, without more info, it seems pretty hard (or impossible?) to do :-?
Would it be possible to add a simple example script that (1) takes the path to an image as input, (2) loads the model, and (3) runs it and outputs the "COVID probability"?
Any idea on the number of trainable parameters for COVID-Net CXR Small and Large?
The paper says the number of parameters in COVID-Net is 116.6 million. However, the Keras network compiled in https://github.com/busyyang/COVID-19, which replicates the architecture in the paper, has 364.6 million parameters, more than 3 times as many.
Sir, can you provide simple code to load the model? I have tried many methods and all show some errors; kindly also mention the versions of TF. The last error I got was:
"module 'tensorflow._api.v2.train' has no attribute 'import_meta_graph'"
The code was as follows:
import tensorflow as tf
new_graph = tf.Graph()
with tf.compat.v1.Session(graph=new_graph) as sess:
saver = tf.train.import_meta_graph('/content/models2/model.meta')
saver.restore(sess, "/content/models2/model")
I have also tried tf.compat.v1.Session and tf.Session but am unable to load the model.
For training the model, is there a script available that can separate the classes (Normal, Pneumonia and COVID-19) based on the train and test text files?
I've managed to build the train and test data sets but they aren't labeled at the moment.
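In case a sketch helps in the meantime: assuming each line of the split files follows the pattern `patient_id filename class source` (an assumption to verify against your generated files, since the column layout has changed across COVIDx versions), something like the following can label the images by copying them into per-class subfolders:

```python
import os
import shutil

def sort_split_into_class_dirs(split_file, src_dir, dst_root):
    """Copy each image listed in a COVIDx split file into a subfolder
    named after its class label (e.g. dst_root/COVID-19/)."""
    with open(split_file) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 3:
                continue  # skip blank or malformed lines
            filename, label = parts[1], parts[2]
            dst_dir = os.path.join(dst_root, label)
            os.makedirs(dst_dir, exist_ok=True)
            shutil.copy(os.path.join(src_dir, filename),
                        os.path.join(dst_dir, filename))
```

Run it once per split, e.g. `sort_split_into_class_dirs('train_split.txt', 'data/train', 'data/train_sorted')` (the file and folder names here are placeholders).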
I am trying to run inference on multiple input images (six of them) and getting the error below.
It works fine for 4 input images.
2020-04-14 07:43:21.737334: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-04-14 07:43:21.748542: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2020-04-14 07:43:21.748968: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f2a7c638260 executing computations on platform Host. Devices:
2020-04-14 07:43:21.749010: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): ,
Killed
Hi. First of all, thank you so much for sharing the data and network. Though you have removed the duplicates from test_COVIDx.txt, as far as I can tell there are still some duplicate filenames in the train_COVIDx.txt file. Please consider adding a function at the end of your create_COVIDx_v2.ipynb notebook that resolves the duplicate issues and sorts all of the images in the train/test data into their respective subfolders (i.e., Normal, Pneumonia and COVID-19).
I see that in COVIDx2 you used only 3 images from the Figure1 collection. Is there a reason for that, or was it just timing? Do you know if there are overlaps between Figure1 and ieee8023/covid-chestxray-dataset?
I have a question about the RSNA Pneumonia Challenge dataset.
I'm going to download the dataset (4 GB) for detection of pneumonia with my own NN model.
I was wondering whether the dataset is paid, or whether there is any constraint implied by the term "challenge".
It looks like additional training and test examples were added, but the confusion matrix and results have not been updated to reflect this. I recommend either updating the results or, if the results are not available yet (possibly still training the new model?), adding a quick note so there isn't confusion about the confusion matrix, which still shows only 8 ground truth COVID-19 samples. As there are two false positives in the confusion matrix, a reader could assume that the results have been miscalculated with false negatives as false positives, which would swap the precision and recall.
Apparently the model uses 4 softmax layers, but the latest dataset creation notebook only splits the cases into Normal, Pneumonia and Covid-19. There seems to be a mismatch here. What is the latest approach?
Hello, and thank you for sharing this great project.
I am testing the net on out-of-sample data, some known COVID and non-COVID images, and having some trouble. These are my questions to the community, if anybody could help:
Is it mandatory to filter the out-of-sample images to PA projections? I don't know if it is important, or whether it is supposed to work fine with AP too.
Is it necessary to transform the image to RGB? The README says the net expects a (224, 224, 3) array, and DICOM images are just grayscale. I'm trying the OpenCV conversion below, but I don't know if this is the correct way to handle DICOM files:
img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
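That cvtColor call simply replicates the single gray channel three times, which matches the (224, 224, 3) input shape the README describes. A dependency-free sketch of the same operation (this ignores any DICOM windowing/normalisation, which would still need to happen first):

```python
def gray_to_rgb(img):
    """Replicate a 2-D grayscale image (given as a list of rows) into
    three identical channels, mimicking cv2.COLOR_GRAY2RGB."""
    return [[[p, p, p] for p in row] for row in img]

rgb = gray_to_rgb([[0, 128], [255, 64]])
print(rgb[0][1])  # [128, 128, 128]
```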
COVID19 -> Train (223 images), Test (31 images)
Normal -> Train (7966 images), Test (885 images)
Pneumonia -> Train (5451 images), Test (594 images)
Kindly confirm whether the distribution is correct.
Furthermore, do you have any benchmark results with the above data distribution? The benchmark presented in https://github.com/lindawangg/COVID-Net#results uses fewer test samples. Which version of the data distribution do you recommend for comparison with COVID-Net? Kindly advise.
I ran the inference.py script on some "healthy" X-ray images but the result was "COVID-19". I would like to check what the network is really classifying.
I read in the joint paper that the GSInquire method was used for visual control of results. Is the method available? Does someone know how it works?
Or is there an alternative way to visualise the result?
Why are the evaluation results different when loading model-8485 versus model-10 (epoch 10)?
These are my results when running eval.py using COVIDNet-CXR-Large, test_COVIDx2.txt and model-10 (epoch 10)
I am not getting the same sensitivity values when training for 15 more epochs. The sensitivity values are not retained even for the 1st epoch!
I have used COVIDNet-CXR-Large model and the dataset files being train_COVIDx2.txt and test_COVIDx2.txt.
It is mentioned in the paper that you used a learning rate policy which reduces the learning rate if learning stagnates for a period of time. The factor and patience values (0.7 and 5, respectively) are also mentioned in the paper. However, I did not come across any line in the code that implements this.
I have also tried training the model on the same dataset for another 30 epochs with different learning rates (2e-07 and 2e-08). The sensitivity kept dropping.
Am I missing something?
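For reference, the policy the paper describes is the usual reduce-on-plateau schedule. A minimal sketch of that logic with the paper's factor=0.7 and patience=5 (a hypothetical helper for illustration, not code from the repo):

```python
class PlateauLR:
    """Multiply the learning rate by `factor` whenever the monitored
    loss has not improved for `patience` consecutive epochs."""

    def __init__(self, lr, factor=0.7, patience=5):
        self.lr, self.factor, self.patience = lr, factor, patience
        self.best = float("inf")
        self.wait = 0

    def step(self, loss):
        if loss < self.best:
            self.best, self.wait = loss, 0  # improvement: reset the counter
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.lr *= self.factor      # stagnated: decay the LR
                self.wait = 0
        return self.lr
```

In Keras-based code, `tf.keras.callbacks.ReduceLROnPlateau(factor=0.7, patience=5)` implements the same behaviour.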
There seems to be some discordance between the number of training and testing samples in the RSNA pneumonia dataset versus the dataset distribution mentioned on the GitHub page, probably because of the multiple rows corresponding to the same patient ID in the RSNA dataset's CSV file (it was originally a detection task). Please verify.