
intracranial-hemorrhage-detection's Introduction

Howdy, I'm Kyle 👋

🌱 currently growing TurnKey Trips and bringing healthcare accessibility to rural America.

🤓 teaching data science & cognitive science at UCSD

💬 love connecting with people

🎾 likely on a tennis court

Connect with me:

Kyle Shannon | LinkedIn

intracranial-hemorrhage-detection's People

Contributors

kshannon, utcsox


intracranial-hemorrhage-detection's Issues

add random data augmentation to data loader

Rotation: could be bound to 90-degree increments, or perhaps ±10 degrees...
Flip: yes, this is a good option to include
Scaling: could random zooms work? Possibly, but we need to be careful: we could crop out the actual hemorrhage, thus losing class 1 data and mislabeling it....
Translation: maybe +/- 50px? How should we fill the exposed area (constant, edge, reflect, wrap)?
Denoising/Noise: add Gaussian noise?

Training data augmentation. Real-time data augmentation (Fig. 3, M4) was performed by applying geometric transformations (rotation, scaling and translation) to make models learn invariant features to geometric perturbations. In addition, to improve invariance of the model to noise, either standard or denoised images was randomly selected to be used. We generated denoised images for standard cases by applying a median filter with a window size of 3 and used the scanner-generated denoised images if they already existed in the datasets. For the cases only with scanner-generated denoised images, only the denoised images were used as we were concerned about a bias that might be produced by reversing the denoising processes that are unknown to us. Rotation angles ranging from −10° to 10° with an interval of 1°, scaling ratios of heights and widths ranging from 90% to 100% with an interval of 1%, translation parameters ranging from −12 to 12 pixels in x and y directions with an interval of 1 pixel, and a median filter with a window size of 3 were used for augmentation. All these parameters were randomly selected in the predefined ranges. Lee et al., NATURE BIOMEDICAL ENGINEERING | VOL 3 | MARCH 2019 | 173–182
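The parameter ranges from the Lee et al. quote above could be sketched roughly as below with scipy. This is only an illustration: `fit_to_shape` and `random_augment` are hypothetical names, and the fill modes and filter choices are assumptions, not the repo's actual loader code.

```python
import numpy as np
from scipy import ndimage

def fit_to_shape(img, shape):
    """Center-crop or zero-pad a 2D array back to the target shape."""
    out = np.zeros(shape, dtype=img.dtype)
    h, w = min(shape[0], img.shape[0]), min(shape[1], img.shape[1])
    oy, ox = (shape[0] - h) // 2, (shape[1] - w) // 2
    iy, ix = (img.shape[0] - h) // 2, (img.shape[1] - w) // 2
    out[oy:oy + h, ox:ox + w] = img[iy:iy + h, ix:ix + w]
    return out

def random_augment(img, rng=np.random):
    """Rotate, scale, translate, and sometimes median-filter one 2D slice.

    Ranges follow the quoted paper: rotation in [-10, 10] degrees
    (1-degree steps), scaling 90-100% (1% steps), translation of
    -12..12 pixels per axis, median filter window of 3.
    """
    angle = rng.randint(-10, 11)
    scale = rng.randint(90, 101) / 100.0
    shift = rng.randint(-12, 13, size=2)

    out = ndimage.rotate(img, angle, reshape=False, mode='nearest')
    out = fit_to_shape(ndimage.zoom(out, scale, mode='nearest'), img.shape)
    out = ndimage.shift(out, shift, mode='nearest')
    if rng.rand() < 0.5:  # randomly swap in a denoised version
        out = ndimage.median_filter(out, size=3)
    return out
```

Sampling whole-degree / whole-percent steps with `randint` matches the "interval of 1" phrasing in the quote.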

Enhance predict script

Need to make use of the data loader to do batch processing of predictions and write them out to a CSV file. Most of the infrastructure for this already exists. Might need to tweak the data loader slightly. Might be a good task for @utcsox to work on.
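A minimal sketch of the batch-predict-to-CSV flow. Everything here is an assumption about the eventual interfaces: `batches` stands in for whatever the data loader yields, `model.predict` for the trained model, and the subtype names follow the Kaggle `ID_subtype,Label` submission format rather than anything already in the repo.

```python
import csv
import numpy as np

# Assumed subtype ordering for the submission file -- verify against the
# competition's sample_submission.csv before relying on it.
SUBTYPES = ['epidural', 'intraparenchymal', 'intraventricular',
            'subarachnoid', 'subdural', 'any']

def predict_to_csv(model, batches, out_path, subtypes=SUBTYPES):
    """Run batched inference and write one (ID_subtype, Label) row per pair.

    `batches` is assumed to yield (ids, image_batch) tuples from the data
    loader; `model.predict` is assumed to return per-subtype probabilities.
    """
    with open(out_path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['ID', 'Label'])
        for ids, x in batches:
            probs = model.predict(x)  # shape: (batch, n_subtypes)
            for img_id, row in zip(ids, probs):
                for name, p in zip(subtypes, row):
                    writer.writerow(['{}_{}'.format(img_id, name), float(p)])
```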

define a model template script

the script should have a tf model defined, along with constants, that can be imported into the train.py script.
which model-{custom-name}.py you use will be passed as an arg to the train.py script
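The "pass the template name as an arg" part could look like the sketch below. Names here (`load_model_module`, `build_model`, `BATCH_SIZE`) are assumptions about what each template would define, not the repo's actual API. One caveat worth deciding early: a dash as in model-{custom-name}.py is not a legal Python module name, so the files would either need underscores or an `importlib.util` spec-from-file load.

```python
import importlib

def load_model_module(name):
    """Import model_{name} and hand train.py its entry points.

    Assumes each template defines build_model() plus shared constants
    such as BATCH_SIZE (hypothetical names for illustration).
    """
    mod = importlib.import_module('model_{}'.format(name))
    return mod.build_model, mod.BATCH_SIZE
```

train.py would then do something like `build, batch_size = load_model_module(args.model)` and never hard-code a specific architecture.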

EDA

  • Please add an intracranial-hemorrhage-detection/eda/ directory at the top level and put *.ipynb EDA scripts in this folder.
  • Tony, you had mentioned you did a first pass EDA on the data? If so, you can upload it, and maybe Chris can do some more work on it once he gets the data.

Kaggle Team

Accept Tony/Chris on the team. General housekeeping.

Update Dicom Reader Function

need to add the ability to recenter, window, and level a DICOM. These should be flagged, with params set for the size of the window/level.
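The recenter and window/level steps could be sketched like this. Function names are hypothetical; the slope/intercept rescale uses the standard DICOM RescaleSlope/RescaleIntercept tags, and the brain-window preset mentioned in the comment is the caller's choice, not hard-coded.

```python
import numpy as np

def to_hu(pixels, slope, intercept):
    """Recenter raw DICOM pixel values into Hounsfield units using the
    RescaleSlope / RescaleIntercept header tags."""
    return pixels * slope + intercept

def window_level(hu, center, width):
    """Clip HU pixels to a window and rescale to [0, 1].

    e.g. a typical brain window is roughly center=40, width=80; the
    flagged params decide the actual preset.
    """
    lo = center - width / 2.0
    hi = center + width / 2.0
    return (np.clip(hu, lo, hi) - lo) / (hi - lo)
```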

3D ideas

Thoughts on maximizing 3d representation via interpolation w/o training a 3d model:

Slice interpolation was introduced to mimic how radiologists integrate information from all adjacent images of a contiguous three-dimensional (3D) volume concurrently, rather than examine each single axial 2D slice in isolation. Interpolated images from adjacent slices were provided to the model with a modified loss function during training to imitate the 3D integration of image interpretation by radiologists. Lee et al., NATURE BIOMEDICAL ENGINEERING | VOL 3 | MARCH 2019 | 173–182
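The simplest version of this idea is just a linear average of each adjacent slice pair; this is a stand-in for the paper's interpolation scheme, not their exact method, and `interpolate_slices` is a hypothetical name.

```python
import numpy as np

def interpolate_slices(vol):
    """Return midpoint slices linearly averaged from each adjacent pair.

    `vol` is an (n_slices, H, W) axial stack; output has n_slices - 1
    interpolated slices that could be fed to the 2D model alongside
    the originals.
    """
    return 0.5 * (vol[:-1] + vol[1:])
```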

Though note this salient point from the same paper about strict 3d voxel based approaches:

Another approach to address inter-slice dependency is to build a 3D network that directly inputs the voxel data from the entire imaging volume into a 3D format rather than as pixel-data from discrete axial slices in a 2D format. To compare the 3D versus 2D approaches, we trained a 3D model using previously described methodology [21] by using case-level labels aggregated from slice-level labels, as well as volume data with a standardized dimensionality (24 × 512 × 512 voxels) generated using 2D slices. The resulting 3D model, however, achieved a mAP of only 0.328 for the multi-label classification of our five ICH subtypes, which is substantially inferior to the mAP we obtained with our existing 2D model (mAP of 0.686). This finding is consistent with the 'curse of dimensionality' reported in a previous study [24], which noted that the amount of data required to train a deep-learning model scales exponentially with the dimensionality of the data.

Modeling Ideas

VGG16 and ResNet50 as a starting point was a good idea, and it seems that other people in the field have taken a similar course of action.

Our proposed system for the detection and classification of ICH uses multiple ImageNet [6] pretrained deep convolutional neural networks (DCNNs), a preprocessing pipeline, an atlas creation module and a prediction-basis selection module (Fig. 1). The four DCNNs used for building our model are VGG16 [7], ResNet-50 [8], Inception-v3 [9] and Inception-ResNet-v2 [10]. Lee et al., NATURE BIOMEDICAL ENGINEERING | VOL 3 | MARCH 2019 | 173–182

Enhance data loader

Might be nice for the data loader to assist with:

  • upsample/downsample of class all or 0
  • good to consider being able to train only on class != 0
  • good to be able to mini-batch sample on class 1/0, also possibly at the subclass level

we need to talk more about this soon, as it will allow us to apply more complex training schemes and modeling.
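The class-1/0 mini-batch sampling bullet could work roughly like this sketch, where the minority class is implicitly upsampled by drawing with replacement. The function name and the binary-'any'-label assumption are both illustrative.

```python
import numpy as np

def balanced_batch_indices(labels, batch_size, rng=np.random):
    """Sample a minibatch with equal numbers of class-0 and class-1 rows.

    `labels` is assumed to be the binary 'any' column; drawing with
    replacement means the rarer class is upsampled automatically.
    """
    pos = np.flatnonzero(labels == 1)
    neg = np.flatnonzero(labels == 0)
    half = batch_size // 2
    idx = np.concatenate([rng.choice(pos, half, replace=True),
                          rng.choice(neg, batch_size - half, replace=True)])
    rng.shuffle(idx)  # avoid all positives landing first in the batch
    return idx
```

The same idea extends to the subclass level by sampling per-subtype index pools instead of just pos/neg.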

basic dataloader script

Need to implement the basic data loader. Shouldn't be too bad for this first version.

Tony, I am going to work on this today (Sat 9/28) and hopefully have something you can look at by tomorrow. If you can verify that it will work and look over the code that would be great.
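For reference while reviewing, a first version could be as small as the generator below. `load_fn` is a placeholder for whatever reads one preprocessed image by id; everything else is plain batching, so the actual script will differ in the I/O details.

```python
import numpy as np

def batch_generator(ids, load_fn, batch_size, shuffle=True, rng=np.random):
    """Minimal loader: yield (id_batch, image_batch) until ids run out.

    `load_fn(image_id)` is an assumed hook returning one image array;
    the final batch may be smaller than batch_size.
    """
    order = np.arange(len(ids))
    if shuffle:
        rng.shuffle(order)
    for start in range(0, len(ids), batch_size):
        chunk = order[start:start + batch_size]
        id_batch = [ids[i] for i in chunk]
        yield id_batch, np.stack([load_fn(i) for i in id_batch])
```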

CSV model logger

once a model is saved, we can write to a CSV file the model's name and specifications: what the model is and how it was trained, e.g. batch size. Perhaps even the training time, just to keep as a record.
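A small append-only logger along these lines would do; the function name and column set are assumptions, and it assumes the same spec keys are used on every call to a given file.

```python
import csv
import os
import time

def log_model(csv_path, name, **specs):
    """Append one row per saved model: name, timestamp, and any
    hyperparameters passed as keyword args (batch_size, epochs, ...)."""
    fields = ['name', 'timestamp'] + sorted(specs)
    write_header = not os.path.exists(csv_path)
    with open(csv_path, 'a', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        if write_header:
            writer.writeheader()
        row = {'name': name,
               'timestamp': time.strftime('%Y-%m-%d %H:%M:%S')}
        row.update(specs)
        writer.writerow(row)
```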

data preprocessing

need to add true upsampling and mind the class distributions. Generate new train/validation CSVs for the data loader:

  • 7 cols: id, subtype 1-5, all
  • remove duplicates (see comment below)
  • randomly downsample class 0 to match class 1
  • split into train/validation (look at bar plot distributions)
  • create 3 sets of CSVs [(balanced train/val), (class_any train/val), (class_subtypes train/val)], all using the same base test/train split.
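The "randomly downsample class 0 to match class 1" step could be as simple as the sketch below, operating on the binary 'all' column before the train/validation split. The function name is illustrative.

```python
import numpy as np

def downsample_majority(labels, rng=np.random):
    """Return sorted row indices where class 0 has been randomly
    downsampled to match the class-1 count, for the balanced CSV."""
    pos = np.flatnonzero(labels == 1)
    neg = np.flatnonzero(labels == 0)
    keep_neg = rng.choice(neg, size=len(pos), replace=False)
    return np.sort(np.concatenate([pos, keep_neg]))
```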
