Howdy, I'm Kyle 👋
🌱 currently growing TurnKey Trips and bringing healthcare accessibility to rural America.
🤓 teaching data science & cognitive science at UCSD
💬 love connecting with people
🎾 likely on a tennis court
Repository to perform intracranial hemorrhage detection using data from RSNA's Medical Imaging competition.
License: MIT
Goal is to get a baseline model trained and submitted this week.
Rotation: could be bound to 90-degree increments, or perhaps 10-degree steps...
Flip: yes, this is a good option to include.
Scaling: could random zooms work? Possibly, but we need to be careful: an aggressive zoom could crop out the actual hemorrhage, losing class 1 data and causing it to be misclassified.
Translation: maybe +/- 50px? How should the exposed area be filled (constant, edge, reflect, wrap)?
Denoising/Noise: add Gaussian noise?
Training data augmentation. Real-time data augmentation (Fig. 3, M4) was performed by applying geometric transformations (rotation, scaling and translation) to make models learn invariant features to geometric perturbations. In addition, to improve invariance of the model to noise, either standard or denoised images was randomly selected to be used. We generated denoised images for standard cases by applying a median filter with a window size of 3 and used the scanner-generated denoised images if they already existed in the datasets. For the cases only with scanner-generated denoised images, only the denoised images were used as we were concerned about a bias that might be produced by reversing the denoising processes that are unknown to us. Rotation angles ranging from −10° to 10° with an interval of 1°, scaling ratios of heights and widths ranging from 90% to 100% with an interval of 1%, translation parameters ranging from −12 to 12 pixels in x and y directions with an interval of 1 pixel, and a median filter with a window size of 3 were used for augmentation. All these parameters were randomly selected in the predefined ranges. Lee et al. NATURE BIOMEDICAL ENGINEERING | VOL 3 | MARCH 2019 | 173–182
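A minimal sketch of the augmentation scheme quoted above, using the Lee et al. parameter ranges (rotation ±10°, scale 90–100%, translation ±12 px, median filter window 3). This is an assumption about how we might wire it up, not this repo's actual pipeline; it leans on `scipy.ndimage` for the resampling.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)

def augment(img):
    """Randomly augment one 2D slice per the Lee et al. ranges:
    rotation -10..10 deg, scale 90..100%, translation +/-12 px,
    and a 50/50 chance of a 3x3 median-filtered ('denoised') copy."""
    angle = rng.integers(-10, 11)            # whole degrees
    scale = rng.integers(90, 101) / 100.0    # height/width ratio
    dx, dy = rng.integers(-12, 13, size=2)   # pixel shifts

    out = ndimage.rotate(img, angle, reshape=False, mode='nearest')
    # shrink by resampling, then pad back to the original shape
    out = ndimage.zoom(out, scale, mode='nearest')
    pad_h = img.shape[0] - out.shape[0]
    pad_w = img.shape[1] - out.shape[1]
    out = np.pad(out, ((0, pad_h), (0, pad_w)), mode='edge')
    out = ndimage.shift(out, (dy, dx), mode='nearest')

    if rng.random() < 0.5:                   # randomly use a denoised variant
        out = ndimage.median_filter(out, size=3)
    return out
```

All parameters are drawn uniformly from the predefined ranges, matching the paper's description; the boundary mode (`nearest` vs. reflect/wrap) ties back to the open translation question above.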
Right now predict.py processes one image at a time; it would be nice to run these as a batch. The current average run time is about 1:45.
Need to make use of the data loader to do batch processing of predictions and write them out to a CSV file. Most of the infrastructure for this already exists; we might need to tweak the data loader slightly. Might be a good task for @utcsox to work on.
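One possible shape for this, sketched with stand-in hooks: `model_fn` and `load_fn` are hypothetical placeholders for whatever predict.py and the data loader actually expose, and the column names are assumptions rather than the competition's required format.

```python
import csv
import numpy as np

def predict_in_batches(model_fn, ids, load_fn, out_csv, batch_size=32):
    """Run predictions over image IDs in batches, streaming rows to a CSV.
    `model_fn` maps an (N, H, W) batch to an (N,) score array; `load_fn`
    maps a single ID to an (H, W) array. Both are stand-ins for the real
    model / data-loader hooks in this repo."""
    with open(out_csv, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['ID', 'Label'])
        for start in range(0, len(ids), batch_size):
            chunk = ids[start:start + batch_size]
            batch = np.stack([load_fn(i) for i in chunk])
            for img_id, score in zip(chunk, model_fn(batch)):
                writer.writerow([img_id, float(score)])
```

Batching the forward pass is where the 1:45-per-image cost should collapse, since the model call amortizes over the whole batch.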
The script should define a tf model along with constants that can be imported into the train.py script.
Which model-{custom-name}.py you use will be passed as an arg to the train.py script.
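A sketch of how train.py could resolve that argument into a module. The `--model` flag name and the expectation that each model script exposes its own constants (e.g. a `BATCH_SIZE`) are assumptions for illustration; `importlib` handles the hyphenated `model-{custom-name}` filenames that a plain `import` statement cannot.

```python
import argparse
import importlib

def load_model_module(name):
    """Import a model-{custom-name}.py script by name; hyphenated module
    names work via importlib even though `import model-x` would not."""
    return importlib.import_module(name.replace('.py', ''))

# Hypothetical CLI wiring for train.py, e.g.:
#   python train.py --model model-vgg16
parser = argparse.ArgumentParser()
parser.add_argument('--model', required=True,
                    help='model-{custom-name} script defining the tf model')
```

train.py would then call something like `load_model_module(args.model)` and read the model and its constants off the returned module object.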
Create an intracranial-hemorrhage-detection/eda/ directory at the top level and put *.ipynb EDA scripts in this folder. Accept Tony/Chris onto the team; general housekeeping.
Need to add the ability to recenter, window, and level a DICOM. These should be flagged, with params set for the size of the window/level.
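The window/level step could look like the sketch below, assuming the slice has already been converted to Hounsfield units (raw DICOM pixels would first need `RescaleSlope`/`RescaleIntercept` applied). The function name and the 0..1 output range are choices for illustration, not existing repo API.

```python
import numpy as np

def window_image(hu, level, width):
    """Apply a window/level to a slice already in Hounsfield units:
    clip to [level - width/2, level + width/2], then rescale to 0..1."""
    lo, hi = level - width / 2, level + width / 2
    return (np.clip(hu, lo, hi) - lo) / (hi - lo)
```

With `level` and `width` exposed as the flagged params, the brain/subdural/stroke presets noted below drop straight in, e.g. `window_image(hu, level=40, width=80)` for a brain window.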
Thoughts on maximizing 3D representation via interpolation w/o training a 3D model:
Slice interpolation was introduced to mimic how radiologists integrate information from all adjacent images of a contiguous three-dimensional (3D) volume concurrently, rather than examine each single axial 2D slice in isolation. Interpolated images from adjacent slices were provided to the model with a modified loss function during training to imitate the 3D integration of image interpretation by radiologists. Lee et al. NATURE BIOMEDICAL ENGINEERING | VOL 3 | MARCH 2019 | 173–182
Though note this salient point from the same paper about strict 3D voxel-based approaches:
Another approach to address inter-slice dependency is to build a 3D network that directly inputs the voxel data from the entire imaging volume into a 3D format rather than as pixel-data from discrete axial slices in a 2D format. To compare the 3D versus 2D approaches, we trained a 3D model using previously described methodology [21] by using case-level labels aggregated from slice-level labels, as well as volume data with a standardized dimensionality (24 × 512 × 512 voxels) generated using 2D slices. The resulting 3D model, however, achieved a mAP of only 0.328 for the multi-label classification of our five ICH subtypes, which is substantially inferior to the mAP we obtained with our existing 2D model (mAP of 0.686). This finding is consistent with the 'curse of dimensionality' reported in a previous study [24], which noted that the amount of data required to train a deep-learning model scales exponentially with the dimensionality of the data.
Starting with vgg16 and resnet50 was a good idea, and it seems that others in the field have taken a similar course of action.
Our proposed system for the detection and classification of ICH uses multiple ImageNet-pretrained deep convolutional neural networks (DCNNs), a preprocessing pipeline, an atlas creation module and a prediction-basis selection module (Fig. 1). The four DCNNs used for building our model are VGG16, ResNet-50, Inception-v3 and Inception-ResNet-v2. Lee et al. NATURE BIOMEDICAL ENGINEERING | VOL 3 | MARCH 2019 | 173–182
Might be nice for the data loader to assist with:
We need to talk more about this soon, as this will allow us to apply more complex training schemes and modeling.
Need to implement the basic data loader. Shouldn't be too bad for this first version.
Tony, I am going to work on this today (Sat 9/28) and hopefully have something you can look at by tomorrow. If you can verify that it will work and look over the code that would be great.
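For the basic first version, a generator-style loader could be as small as the sketch below. The function name, `load_fn` hook, and infinite-epoch behavior are assumptions about the eventual design, not the code Tony will be reviewing; augmentation/windowing would slot in where the batch is assembled.

```python
import numpy as np

def data_loader(ids, load_fn, labels, batch_size=32, shuffle=True, seed=0):
    """Minimal generator loader: yields (images, labels) batches forever,
    reshuffling each epoch. `load_fn` maps an ID to an (H, W) array."""
    ids = np.asarray(ids)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    while True:
        order = rng.permutation(len(ids)) if shuffle else np.arange(len(ids))
        for start in range(0, len(ids), batch_size):
            idx = order[start:start + batch_size]
            x = np.stack([load_fn(i) for i in ids[idx]])
            yield x, labels[idx]
```

Keeping the loader ID-driven like this is what would later let the same machinery serve both training and the batched prediction work above.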
brain: W:80 L:40
subdural: W:130–300 L:50–100
stroke: W:8 L:32 or W:40 L:40
soft tissue: W:350–400 L:20–60
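These presets could live in one place so the window/level flags can reference them by name. The midpoints chosen for the ranged entries below are my own placeholder picks, still an open question:

```python
# Window (W) / level (L) presets from the notes above, in Hounsfield units.
# Ranged entries (subdural, soft tissue) use a representative midpoint here.
WINDOW_PRESETS = {
    'brain':       {'width': 80,  'level': 40},
    'subdural':    {'width': 215, 'level': 75},   # from W:130-300, L:50-100
    'stroke':      {'width': 40,  'level': 40},   # the W:40 L:40 variant
    'soft_tissue': {'width': 375, 'level': 40},   # from W:350-400, L:20-60
}
```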
Once a model is saved, we can write a row to a CSV file with the model's name and specifications: what the model is and how it was trained (e.g. batch size), perhaps even the training time, just to keep a record.
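A minimal append-to-registry sketch; the file path, column set, and keyword-driven specs are all suggestions rather than a settled schema.

```python
import csv
import os
from datetime import datetime

def log_model_run(path, name, **specs):
    """Append one row of model metadata (name, timestamp, arbitrary
    training specs such as batch_size/epochs) to a CSV registry."""
    fields = ['name', 'timestamp'] + sorted(specs)
    new_file = not os.path.exists(path)
    with open(path, 'a', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        if new_file:
            writer.writeheader()
        writer.writerow({'name': name,
                         'timestamp': datetime.now().isoformat(timespec='seconds'),
                         **specs})
```

One caveat with free-form `**specs`: every run appended to the same file should use the same spec keys, or the columns will drift.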
Following some of the work Jeremy Howard did in his Kaggle kernel. Some random HU DICOM histograms:
Need to add true upsampling and be mindful of class distributions. Generate new train/validation CSVs for the data loader.
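A naive version of that upsampling, sketched with pandas. The label column name (`any`) and balance-to-parity target are assumptions; splitting into train/validation before upsampling (so resampled duplicates never leak into validation) is the one invariant worth keeping.

```python
import pandas as pd

def upsample(df, label_col='any', seed=0):
    """Upsample the positive class by resampling it with replacement
    until classes are balanced, then shuffle. Apply to the train split
    only, after the train/validation split, to avoid leakage."""
    pos = df[df[label_col] == 1]
    neg = df[df[label_col] == 0]
    extra = len(neg) - len(pos)
    if extra <= 0:
        return df
    boost = pos.sample(n=extra, replace=True, random_state=seed)
    return (pd.concat([df, boost])
              .sample(frac=1, random_state=seed)
              .reset_index(drop=True))
```

The balanced frame can then be written out with `to_csv` as the new train CSV for the data loader.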