derm7pt

derm7pt preprocesses the Seven-Point Checklist Dermatology Dataset and converts the data into a more accessible format.

derm7pt is a Python module that serves as a starting point for using the data described in:

J. Kawahara, S. Daneshvar, G. Argenziano, and G. Hamarneh, “Seven-Point Checklist and Skin Lesion Classification using Multitask Multimodal Neural Nets,” IEEE Journal of Biomedical and Health Informatics, vol. 23, no. 2, pp. 538–546, 2019.

Download the Data

http://derm.cs.sfu.ca

The images and meta-data (e.g., seven-point checklist criteria, diagnosis) can be downloaded from the external site above.

The actual images and meta-data are not stored in this repo.

Minimal Example

import sys, os
import pandas as pd
sys.path.insert(0, os.path.abspath(os.path.join(os.getcwd(), '..'))) # To import derm7pt
from derm7pt.dataset import Derm7PtDatasetGroupInfrequent

# Change this line to your data directory.
dir_release = '/local-scratch/jer/data/argenziano/release_v0'

# Dataset after grouping infrequent labels.
derm_data = Derm7PtDatasetGroupInfrequent(
    dir_images=os.path.join(dir_release, 'images'), 
    metadata_df=pd.read_csv(os.path.join(dir_release, 'meta/meta.csv')), 
    train_indexes=list(pd.read_csv(os.path.join(dir_release, 'meta/train_indexes.csv'))['indexes']), 
    valid_indexes=list(pd.read_csv(os.path.join(dir_release, 'meta/valid_indexes.csv'))['indexes']), 
    test_indexes=list(pd.read_csv(os.path.join(dir_release, 'meta/test_indexes.csv'))['indexes']))

# The preprocessed dataset as a pandas DataFrame.
# A notebook displays it automatically; in a script, wrap it in print().
derm_data.df

This will group infrequent class labels together and assign numeric values to each class label.
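
To make the grouping step concrete, here is a minimal sketch of the general idea in plain Python. It is not the code inside `Derm7PtDatasetGroupInfrequent`: the helper names, the `min_count` threshold, the `"other"` bucket, and the toy labels are all assumptions chosen for illustration.

```python
from collections import Counter


def group_infrequent(labels, min_count, other_name="other"):
    """Replace every label seen fewer than min_count times with other_name."""
    counts = Counter(labels)
    return [lab if counts[lab] >= min_count else other_name for lab in labels]


def encode_labels(labels):
    """Map each distinct label to an integer, following sorted label order."""
    mapping = {lab: i for i, lab in enumerate(sorted(set(labels)))}
    return [mapping[lab] for lab in labels], mapping


# Made-up labels for illustration; these are not taken from the dataset.
raw = ["nevus", "nevus", "nevus", "melanoma", "melanoma", "rare_a", "rare_b"]

grouped = group_infrequent(raw, min_count=2)
numeric, mapping = encode_labels(grouped)

print(grouped)
print(mapping)
print(numeric)
```

In this toy input the two labels that occur only once are merged into a single bucket before the numeric codes are assigned, so rare classes end up sharing one code.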

You can see the output in this minimal example notebook.

You can find a more comprehensive example here that includes an example of how to classify some of the seven-point checklist.

Installation Instructions

You can see the dependencies and versions derm7pt was tested on here.

To use derm7pt:

  1. Download the data and unzip it to your folder (we will use the folder /local-scratch/jer/data/argenziano/release_v0 for this example)
  2. Clone this repository
  3. Run minimal_example.py, replacing the directory argument with the path to your data folder.

Steps 2 and 3 are shown below:

git clone https://github.com/jeremykawahara/derm7pt.git
cd derm7pt
python minimal_example.py '/local-scratch/jer/data/argenziano/release_v0'

This should output a view of the data that is similar to what is shown in this notebook.
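
If the script fails to find the data, it can help to check the folder layout first. The sketch below is not part of derm7pt; it only looks for the paths that the minimal example above reads (`images`, `meta/meta.csv`, and the three index files). The helper name `missing_paths` is hypothetical.

```python
import os

# Paths read by the minimal example, relative to the release folder.
EXPECTED = [
    "images",
    os.path.join("meta", "meta.csv"),
    os.path.join("meta", "train_indexes.csv"),
    os.path.join("meta", "valid_indexes.csv"),
    os.path.join("meta", "test_indexes.csv"),
]


def missing_paths(dir_release):
    """Return the expected paths that do not exist under dir_release."""
    return [p for p in EXPECTED if not os.path.exists(os.path.join(dir_release, p))]


# Change this line to your data directory.
dir_release = '/local-scratch/jer/data/argenziano/release_v0'

missing = missing_paths(dir_release)
if missing:
    print("Missing from", dir_release, ":", missing)
else:
    print("All expected files were found.")
```

An empty list means the folder matches what the minimal example expects. It does not verify the contents of the files.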

Related Publications

More information about this data can be found in our publication. If you use the data or code, please cite our work:

@article{Kawahara2018-7pt,
author = {Kawahara, Jeremy and Daneshvar, Sara and Argenziano, Giuseppe and Hamarneh, Ghassan},
doi = {10.1109/JBHI.2018.2824327},
issn = {2168-2194},
journal = {IEEE Journal of Biomedical and Health Informatics},
month = {mar},
number = {2},
pages = {538--546},
publisher = {IEEE},
title = {Seven-point checklist and skin lesion classification using multitask multimodal neural nets},
volume = {23},
year = {2019}
}

You can read more about the seven-point checklist here:

G. Argenziano, G. Fabbrocini, P. Carli, D. G. Vincenzo, E. Sammarco, and M. Delfino, “Epiluminescence microscopy for the diagnosis of doubtful melanocytic skin lesions. Comparison of the ABCD rule of dermatoscopy and a new 7-point checklist based on pattern analysis,” Arch. Dermatol., vol. 134, no. 12, pp. 1563–1570, 1998.

Clarifying Notes

The following notes are all related to this publication:

J. Kawahara, S. Daneshvar, G. Argenziano, and G. Hamarneh, “Seven-Point Checklist and Skin Lesion Classification using Multitask Multimodal Neural Nets,” IEEE Journal of Biomedical and Health Informatics, vol. 23, no. 2, pp. 538–546, 2019.

In Section B. Mini-Batches Sampled and Weighed by Label we set k=1.

With k=1, each mini-batch has b = 24k = 24 samples, since there are 24 unique labels across all categories.
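
As a rough illustration of the batch-size arithmetic, the sketch below draws k sample indexes for every unique label, which yields 24k indexes when there are 24 labels. It is an assumption-laden toy, not the sampler used in the paper: the `sample_batch` helper, the toy index, and sampling with replacement are all hypothetical, and the label weighting described in that section is not modelled.

```python
import random


def sample_batch(indexes_by_label, k, rng):
    """Draw k sample indexes (with replacement) for every unique label."""
    batch = []
    for label in sorted(indexes_by_label):
        batch.extend(rng.choices(indexes_by_label[label], k=k))
    return batch


# Hypothetical toy index: 24 labels, each owning three sample indexes.
indexes_by_label = {
    f"label_{i:02d}": [3 * i, 3 * i + 1, 3 * i + 2] for i in range(24)
}

rng = random.Random(0)
batch_k1 = sample_batch(indexes_by_label, k=1, rng=rng)
batch_k2 = sample_batch(indexes_by_label, k=2, rng=rng)

print(len(batch_k1), len(batch_k2))  # 24 48
```

Sampling per label rather than per image guarantees that every label appears in every mini-batch, which is why the batch size is tied to the number of unique labels.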

derm7pt's Issues

checkpoint

Hello and thank you for sharing your work!
Would it be possible for you to provide the model checkpoint for inference?
Thank you in advance, Lucia

Multi-Modal Loss Function and Training Function

Hello!
I hope this message finds you well.
Your paper on Multi-Modal Learning for Skin Lesion Detection and Classification is quite impressive, and I would love to know more about your implementation of the loss function mentioned in the paper. Could you please provide the code for it?
Thank you for your time!

AUCROC in paper

Hi, @jeremykawahara, @hamarneh!

Can you explain how AUCROC is calculated for diagnosis? AUCROC is defined for binary classification, so how should it be interpreted for multi-class classification (there is one diagnosis per image)?

The same question applies to the AUCROC reported for each criterion.

Do you simply use the AUCROC from sklearn, computed separately for each label by treating each label as a binary classification problem?

It would be great if you could explain this point or publish the code that calculates the metrics from the paper.

Why 24 unique labels?

According to paper:
"As we have 24 unique labels across all categories (Table I), this constrains our mini-batches to be of size b = 24k."

I can't get 24. Could you explain more precisely how to get 24? @jeremykawahara

no module derm7pt

Hello, I have a problem running the project, in particular the minimal_example file. The error is: no module derm7pt found. How can I solve it? Thank you in advance

Unable to access dataset

Hi Jeremy,

I am trying to access the data from the website associated to your paper here, but after filling out the form twice with my institutional and personal email address (like half an hour ago), I still have not received any email with download instructions. Could you please help me out?

Thanks :)
