This project was implemented as part of the Data Science Practicum (CSCI 8360) course at the University of Georgia, Spring 2019. The goal was to develop an image segmentation pipeline that identifies as many of the neurons present as possible, as accurately as possible.
Please refer to the Wiki for more details on our approach.
The following instructions will help you get this project running on any machine for development and testing purposes.
- Python:
  To install Python, go here
- PyTorch:
  To install PyTorch, use `pip3 install torch torchvision`
  For more information, visit the PyTorch website.
- Thunder:
  `pip install thunder-python`
  `pip install thunder-extraction`
The training and testing data folders are available in the GCP bucket: gs://uga-dsp/project3
Training datasets are provided with ground truth labeled regions for identified neurons, and testing datasets are provided without ground truth. Each downloadable dataset includes metadata (as JSON), images (as TIFF), and coordinates of identified neurons, also known as ROIs (as JSON). Datasets are around 1 GB zipped and a few GB unzipped. Visit the neurofinder repository for current download links for all datasets.
Download these files into your project directory using gsutil:
gsutil cp -r gs://uga-dsp/project3/* base_dir
- Performs contrast enhancement, as outlined here.
- Create a data folder containing the neurofinder data.
- Input data should follow the below structure:
data
--neurofinder.00.00.test
|--images
--neurofinder.00.01.test
|--images
...........
...........
--neurofinder.xx.xx.test
|--images
- Output data will follow the below structure:
edited_data
--neurofinder.00.00.test
|--images
--neurofinder.00.01.test
|--images
...........
...........
--neurofinder.xx.xx.test
|--images
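As a rough illustration of the two layouts above, the sketch below walks a data folder and pairs each dataset's images directory with its mirror location under the output folder. The helper name `plan_output_dirs` is hypothetical and is not part of contrast.py; it only shows how the input and output trees correspond.

```python
from pathlib import Path


def plan_output_dirs(data_dir, save_dir):
    """Map each <dataset>/images folder under data_dir to its mirror under save_dir.

    Hypothetical helper for illustration; not taken from contrast.py.
    """
    data_dir, save_dir = Path(data_dir), Path(save_dir)
    plan = {}
    for dataset in sorted(data_dir.iterdir()):
        images = dataset / "images"
        # Skip anything that is not a dataset folder with an images subfolder.
        if images.is_dir():
            plan[images] = save_dir / dataset.name / "images"
    return plan
```

For example, `plan_output_dirs("data", "edited_data")` would map `data/neurofinder.00.00.test/images` to `edited_data/neurofinder.00.00.test/images`.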
contrast.py -d <data directory> -s <save location> -u <upper_bound> -l <lower_bound>
Required parameters:
<data directory>
Path to the data folder created as per the specifications.
Optional parameters:
<save location>
The location to save the output. Defaults to current working directory.
<upper_bound>
The upper bound for clipping. Defaults to 99.
<lower_bound>
The lower bound for clipping. Defaults to 3.
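The upper and lower bounds read like percentile cut-offs for pixel intensities. A minimal sketch of that idea follows, assuming NumPy is installed and assuming the bounds are percentiles computed per image. The function name `clip_contrast` and the final rescaling to [0, 1] are illustrative assumptions and may differ from what contrast.py actually does.

```python
import numpy as np


def clip_contrast(image, lower=3, upper=99):
    """Clip intensities to the [lower, upper] percentiles and rescale to [0, 1].

    Illustrative sketch only; defaults mirror the documented bounds (3 and 99).
    """
    image = np.asarray(image, dtype=np.float64)
    lo, hi = np.percentile(image, [lower, upper])
    clipped = np.clip(image, lo, hi)
    if hi == lo:
        # Flat image: nothing to stretch, so return zeros of the same shape.
        return np.zeros_like(clipped)
    return (clipped - lo) / (hi - lo)
```

Clipping before rescaling keeps a few very bright or very dark pixels from compressing the rest of the intensity range.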
- Performs NMF (non-negative matrix factorization), as outlined here.
- Follow the data format given above.
nmf.py -d <data directory> -k <num_components> -p <percentile> -m <max_iter> -o <overlap>
Required parameters:
<data directory>
Path to the data folder created as per the specifications.
Optional parameters:
<num_components>
The number of components to estimate per block. Defaults to 5.
<percentile>
The value for thresholding. Defaults to 90.
<max_iter>
The maximum number of algorithm iterations. Defaults to 20.
<overlap>
The value for determining whether to merge. Defaults to 0.1.
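The flags and defaults listed above could be wired up roughly as in the sketch below. The short flags and default values come from the description above; the long option names, the value types, and the `build_parser` helper are assumptions for illustration and may not match nmf.py.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented nmf.py flags (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="nmf.py")
    # Required: path to the data folder laid out as described above.
    parser.add_argument("-d", "--data", required=True, help="path to the data folder")
    # Optional parameters with the documented defaults.
    parser.add_argument("-k", "--num_components", type=int, default=5,
                        help="number of components to estimate per block")
    parser.add_argument("-p", "--percentile", type=float, default=90,
                        help="value for thresholding")
    parser.add_argument("-m", "--max_iter", type=int, default=20,
                        help="maximum number of algorithm iterations")
    parser.add_argument("-o", "--overlap", type=float, default=0.1,
                        help="value for determining whether to merge")
    return parser
```

With this sketch, `build_parser().parse_args(["-d", "data"])` yields the defaults for every optional parameter.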
Output:
The program will output submission.json in the given data folder.
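For orientation, the sketch below builds a structure in the style of a neurofinder submission: a list with one entry per dataset, each holding the dataset name and a list of regions given as pixel coordinates. This assumes submission.json follows that layout; the helper `build_submission` and the example values are hypothetical and not taken from nmf.py.

```python
import json


def build_submission(regions_by_dataset):
    """Turn {dataset name: list of regions} into a submission-style list.

    Each region is a sequence of (row, column) pixel coordinates.
    Hypothetical helper for illustration only.
    """
    submission = []
    for dataset, regions in sorted(regions_by_dataset.items()):
        submission.append({
            "dataset": dataset,
            # Cast to plain ints so NumPy integers would also serialize cleanly.
            "regions": [{"coordinates": [[int(v) for v in xy] for xy in region]}
                        for region in regions],
        })
    return submission


if __name__ == "__main__":
    example = {"00.00.test": [[(0, 1), (2, 3)]]}
    print(json.dumps(build_submission(example), indent=2))
```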
This project can be used as part of a larger study on the efficacy of new drugs in inhibiting certain types of cross-synaptic activity for the treatment of neurological disorders. With this context in mind, we have adopted certain ethics considerations to ensure that this project cannot be misused for purposes other than the ones intended.
See the ETHICS.md file for details. Also see the Wiki Ethics page for explanations about the ethics considerations.
The master branch of this repo is write-protected, and every pull request must pass a code review before being merged.
Other than that, there are no specific guidelines for contributing.
If you see something that can be improved, please send us a pull request!
(Ordered alphabetically)
- Dhaval Bhanderi
- Hemanth Dandu
- Sumer Singh
- Yang Shi
See the CONTRIBUTORS.md file for details.
This project is licensed under the MIT License - see the LICENSE file for details.