pneumonia_detection's Introduction

Pneumonia Detection

This repository contains code for detecting pneumonia from X-ray images of the lungs. Highest test accuracy achieved: 96.5%.

Link for the dataset

Link to the trained models

Instructions

  • Install the required packages from requirements.txt
  • Use data.py to download and extract the data and sort it by label:
    • pip install kaggle
    • Download a Kaggle API key from your Kaggle account: go to www.kaggle.com -> My Account -> Create New API Token
    • Place the downloaded file at $HOME/.kaggle/kaggle.json.
    • The Kaggle module looks for this token at ~/.kaggle/kaggle.json on Linux, macOS, and other UNIX-based operating systems, and at C:\Users\<windows-username>\.kaggle\kaggle.json on Windows.
    • Run python data.py
  • Run main.py to train on the dataset. For example: python main.py --base_model {base_model} --optimizer {optimizer} --learning_rate {learning_rate} --batch_size {batch_size} --epoch {epoch} --nvidiadali --colab
    • To change the pretrained base model, pass its name when initializing the model object. Use one of the following values:
      • ResNet18
      • ResNet34
      • ResNet50
      • ResNet101
      • ResNet152
      • Alexnet
      • VGG11
      • VGG13
      • VGG16
      • VGG19
      • GoogleNet
      • Inception
    • If training on Colab, mount your Google Drive and pass --colab to save the files to the drive
    • For faster training and image augmentation on Linux, NVIDIA DALI can be used
      • Install NVIDIA DALI. Please see the installation page for details.
      • Pass the --nvidiadali flag to use it
      • Note that NVIDIA DALI is only available for Linux
    • To change the optimizer, use one of the following
      • Adam
      • SGD
      • RMSprop
      • Adagrad
      • Adadelta
  • To test on a new dataset, run test.py with the directory path and one of the base models listed above. Make sure that base model has been trained first.
    • Example: python3 test.py "./data/Corona_Classification_data/test/" Inception True True
  • To generate a Class Activation Map from a trained model, run CAM.py: python3 CAM.py {Model_Name} {Path_to_Image}
    • Example: python3 CAM.py Inception "./data/abc.jpg"
    • This saves the output as CAM_{Model_Name}.jpg in the current directory
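
The training flags above could be validated with a parser along these lines. This is only a sketch: the flag names and allowed values are copied from the instructions, but the defaults, the types, and the `build_parser` helper are assumptions, and the repository's main.py may handle its arguments differently.

```python
import argparse

# Allowed values copied from the lists above; the real main.py may accept others.
BASE_MODELS = ["ResNet18", "ResNet34", "ResNet50", "ResNet101", "ResNet152",
               "Alexnet", "VGG11", "VGG13", "VGG16", "VGG19",
               "GoogleNet", "Inception"]
OPTIMIZERS = ["Adam", "SGD", "RMSprop", "Adagrad", "Adadelta"]


def build_parser():
    """Hypothetical parser mirroring the flags in the example command."""
    parser = argparse.ArgumentParser(description="Train a pneumonia classifier")
    # Defaults below are placeholders chosen for this sketch only.
    parser.add_argument("--base_model", choices=BASE_MODELS, default="ResNet18")
    parser.add_argument("--optimizer", choices=OPTIMIZERS, default="Adam")
    parser.add_argument("--learning_rate", type=float, default=1e-3)
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--epoch", type=int, default=10)
    parser.add_argument("--nvidiadali", action="store_true",
                        help="use NVIDIA DALI (Linux only)")
    parser.add_argument("--colab", action="store_true",
                        help="save files to the mounted drive")
    return parser


# Parse a sample command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--base_model", "Inception", "--optimizer", "SGD",
     "--learning_rate", "0.01", "--batch_size", "16", "--epoch", "5", "--colab"]
)
print(args.base_model, args.optimizer, args.learning_rate, args.colab)
```

With `choices` set, an unsupported model or optimizer name is rejected before any training starts.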

pneumonia_detection's People

Contributors

dependabot[bot], shrey-viradiya, tirth-hihoriya, tirth1306, tirthasheshpatel

pneumonia_detection's Issues

Creating a Best Vote Classifier

We can merge all the trained models to create a best vote classifier.

In other words, we can create a model that outputs predictions that are an average of all the trained models. This will make the output more robust. We can do that simply by loading the saved models and predicting using them.
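
The averaging idea could look roughly like this. It is a minimal sketch that assumes each trained model already returns a list of class probabilities for an image; the function names and the sample numbers are made up for illustration, and loading the saved models is left out.

```python
def average_predictions(per_model_probs):
    """Average class probabilities across models.

    per_model_probs: one list of class probabilities per model,
    all for the same image and in the same class order.
    """
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    return [
        sum(probs[c] for probs in per_model_probs) / n_models
        for c in range(n_classes)
    ]


def predict_class(per_model_probs):
    """Return the index of the class with the highest averaged probability."""
    avg = average_predictions(per_model_probs)
    return max(range(len(avg)), key=avg.__getitem__)


# Illustrative outputs of three models for one image: [normal, pneumonia].
sample = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]
averaged = average_predictions(sample)
predicted = predict_class(sample)
print(averaged, predicted)
```

Here one model leans towards the first class, but the average follows the other two, which is the robustness we are after.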

Data Visualization

  • Add visualizations so that our dataset and model are easier to read and understand.

  • Raw data viz

  • Count plots of different categories

  • Variability, Eigenimages

  • Contrast Between Average Images

  • Architecture of our model

IMP: DON'T MAKE PUBLIC!!!

I have uploaded our predictions on the hidden test set given by Rupal ma'am. So don't make it public until all evaluations are complete!!

Progress Track

This is to keep track of our progress:

  • Finalize the model : Currently it is pretrained googlenet. Other options are pretrained resnet50, pretrained resnet101, or pretrained inceptionnet.
  • Data Preprocessing : data.py does some data cleaning but we may need to preprocess it too (IFF WE HAVE TIME)
  • Training the Model :
    • Checkpoint at 90% (test set) accuracy
    • Checkpoint at (90+2)% (test set) accuracy
    • Checkpoint at (90+5)% (test set) accuracy
    • Checkpoint at (90+7)% (test set) accuracy
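
One way to track these checkpoints during training is sketched below. The milestone values come from the list above; the `newly_reached` helper and the per-epoch accuracies are invented for illustration, and the actual saving of the model is left as a comment.

```python
# Test-set accuracy milestones from the list above: 90, 90+2, 90+5, 90+7.
MILESTONES = [90.0, 92.0, 95.0, 97.0]


def newly_reached(test_accuracy, reached):
    """Return milestones met by test_accuracy that are not yet in `reached`."""
    return [m for m in MILESTONES if test_accuracy >= m and m not in reached]


reached = set()
saved = []  # (epoch, milestone) pairs at which a checkpoint would be written

# Made-up accuracies per epoch, only to exercise the logic.
for epoch, accuracy in enumerate([88.1, 90.4, 93.2, 92.7, 96.5], start=1):
    for milestone in newly_reached(accuracy, reached):
        reached.add(milestone)
        # A real training loop would save the model weights here.
        saved.append((epoch, milestone))

print(saved)
```

Each milestone is recorded only once, so a later dip in accuracy does not trigger a second checkpoint for the same level.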

Any other tasks we need to track??

Suggestion

As we are using Transfer Learning in our project, we should also build a model from scratch, which will help us understand some basic but important concepts. Let me know your views on it. Or let's have a meeting!

Separating model objects in their own branch

Separating model objects in their own branch will make it easier to pull the repository as these large objects will not be included in the pull. We can then pull the required object from its respective branch. How does it sound?

Dataset Labels

Our dataset has nested labels. An image is either normal or a case of pneumonia. A pneumonia case is further labeled as bacteria or virus, and a virus case as COVID-19 or other.

See the table:

Label    | Label_1_Virus_category | Label_2_Virus_category | Image_Count
Normal   | -                      | -                      | 1576
Pnemonia | Stress-Smoking         | ARDS                   | 2
Pnemonia | Virus                  | -                      | 1493
Pnemonia | Virus                  | COVID-19               | 58
Pnemonia | Virus                  | SARS                   | 4
Pnemonia | bacteria               |                        | 2772
Pnemonia | bacteria               | Streptococcus          | 5
data.zip

Additional Feature

  • Need to add Class Activation Mapping
  • Additional model testing on other dataset from kaggle

Documentation

We have no documentation of the trained models or our python files. @tirth-hihoriya had concerns. Let's write the documentation after we finish all the training and testing...
