Respiratory Disease Detection

Keras scikit-learn NumPy Pandas TensorFlow

Respiratory sounds are important indicators of respiratory health and respiratory disorders. The sound emitted when a person breathes is directly related to air movement, changes within lung tissue and the position of secretions within the lung.

Digital recordings of these sounds open up the possibility of using machine learning to automatically diagnose respiratory disorders such as asthma, pneumonia and bronchiolitis.

Dataset

The Respiratory Sound Database was created by two research teams in Portugal and Greece. It includes 920 annotated recordings of varying length - 10s to 90s. These recordings were taken from 126 patients. There are a total of 5.5 hours of recordings containing 6898 respiratory cycles - 1864 contain crackles, 886 contain wheezes and 506 contain both crackles and wheezes. The data includes both clean respiratory sounds as well as noisy recordings that simulate real life conditions. The patients span all age groups - children, adults and the elderly.

This Kaggle dataset includes:

  • 920 .wav sound files
  • 920 annotation .txt files
  • A text file listing the diagnosis for each patient
  • A text file explaining the file naming format
  • A text file listing 91 names (filename_differences.txt)
  • A text file containing demographic information for each patient

The dataset is available on Kaggle.

Notebooks

Preprocessing

The annotation files give the start and end time of every respiratory cycle, so we extract the parts of each audio file that correspond to a patient's respiratory cycles. Cycle lengths vary (the source recordings run from 10s to 90s) and some are outliers, so we compute the mean cycle length and trim all cycles to it. The Librosa module loads the audio files and the Soundfile module writes the new audio clips to the output path.
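The cycle extraction described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names, the `cycle_{i}.wav` output naming, and the assumed annotation layout (start time, end time, crackles flag, wheezes flag on each line) are all assumptions made for the example.

```python
import numpy as np

def parse_annotations(text):
    """Return (start_s, end_s) pairs from annotation text.

    Assumes each line holds: start, end, crackles flag, wheezes flag,
    separated by whitespace (an assumption about the file layout).
    """
    cycles = []
    for line in text.strip().splitlines():
        start, end, *_ = line.split()
        cycles.append((float(start), float(end)))
    return cycles

def mean_cycle_length(cycles):
    """Mean cycle duration in seconds."""
    return float(np.mean([end - start for start, end in cycles]))

def slice_cycles(signal, sr, cycles, target_s):
    """Cut each cycle out of `signal`, keeping at most `target_s` seconds."""
    target_len = int(round(target_s * sr))
    clips = []
    for start, end in cycles:
        clip = signal[int(start * sr):int(end * sr)]
        clips.append(clip[:target_len])  # longer cycles are trimmed
    return clips

def process_recording(wav_path, txt_path, out_dir, target_s):
    """Load one recording, slice its cycles, and write each clip to disk."""
    # Imported here so the helpers above work without the audio libraries.
    import os
    import librosa
    import soundfile as sf

    signal, sr = librosa.load(wav_path, sr=None)  # keep native sample rate
    with open(txt_path) as fh:
        cycles = parse_annotations(fh.read())
    for i, clip in enumerate(slice_cycles(signal, sr, cycles, target_s)):
        # Hypothetical output naming, chosen only for this sketch.
        sf.write(os.path.join(out_dir, f"cycle_{i}.wav"), clip, sr)
```

Note that trimming alone leaves cycles shorter than the mean at their original length; whether and how those are padded is not covered by this sketch.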

Balancing

After preprocessing, the dataset is imbalanced towards one or more classes, and this imbalance must be dealt with. This notebook uses the stratify parameter of scikit-learn's train_test_split so that the train and test sets both contain the same fraction of each class.
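As a small illustration of what a stratified split does, the sketch below uses made-up labels (80 samples of one class, 20 of another) rather than the real dataset; the class names, the 80/20 ratio, and the `test_size` value are assumptions for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up, imbalanced labels: 80 of one class, 20 of the other.
labels = np.array(["pneumonia"] * 80 + ["asthma"] * 20)
features = np.arange(len(labels)).reshape(-1, 1)  # placeholder features

X_train, X_test, y_train, y_test = train_test_split(
    features,
    labels,
    test_size=0.2,
    stratify=labels,   # keep class proportions equal in both splits
    random_state=0,
)

def class_fraction(y, name):
    """Fraction of samples in `y` that belong to class `name`."""
    return float(np.mean(y == name))

train_fraction = class_fraction(y_train, "asthma")
test_fraction = class_fraction(y_test, "asthma")
```

Stratifying keeps the class proportions identical across the two splits; it does not change how imbalanced the classes are within each split.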

Feature Extraction

Extracting features from an audio sample is a task of its own, so third-party libraries help a lot. Here Librosa is used to extract three features: MFCC, chroma STFT and mel spectrogram.

Modeling

Each feature is a matrix of frequencies, which is treated as a heatmap and passed through its own model path for feature-specific training. The paths are then concatenated for general optimisation and classification. The model is a Keras Functional API convolutional model with BatchNormalization and ReLU activation.
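The multi-path idea can be sketched with the Keras Functional API as below. This is an illustrative layout only: the number of layers, filter counts, input shapes, the pooling choices and the number of classes are all assumptions, not the architecture used in the notebook.

```python
from tensorflow import keras
from tensorflow.keras import layers

def conv_branch(input_shape, name):
    """One convolutional path: Conv2D -> BatchNormalization -> ReLU, twice."""
    inp = keras.Input(shape=input_shape, name=name)
    x = inp
    for filters in (16, 32):  # hypothetical filter counts
        x = layers.Conv2D(filters, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
        x = layers.MaxPooling2D(2)(x)
    x = layers.GlobalAveragePooling2D()(x)  # fixed-size vector per branch
    return inp, x

def build_model(mfcc_shape, chroma_shape, mel_shape, n_classes):
    """Three feature-specific paths, concatenated before classification."""
    branches = [
        conv_branch(mfcc_shape, "mfcc"),
        conv_branch(chroma_shape, "chroma"),
        conv_branch(mel_shape, "mel"),
    ]
    inputs = [inp for inp, _ in branches]
    merged = layers.concatenate([out for _, out in branches])
    x = layers.Dense(64, activation="relu")(merged)
    outputs = layers.Dense(n_classes, activation="softmax")(x)
    model = keras.Model(inputs=inputs, outputs=outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

# Hypothetical shapes: (bins, frames, 1 channel) for each feature heatmap.
model = build_model((20, 44, 1), (12, 44, 1), (128, 44, 1), n_classes=6)
```

Each branch sees only its own feature matrix, so early layers can specialise, while the shared dense layers after the concatenation are trained on all three representations together.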

Some Results

[Figure: three result images]

Contributors

akshat26akd, shivam-316
