Comments (2)
Thanks for the thorough write-up! I can see how much thought you have put into this issue.
Let me add a few points to the discussion.
First, in my opinion, the main reason we cannot push the validation AUROC above 0.9 is the "-1" (uncertain) labels. But that is only my guess at the dominant factor; I agree that frontal/lateral mislabeling is something we have to check.
Second, I suspect the test set may be free of mislabels, because both the validation and test sets of CheXpert were reviewed by radiologists, while the other datasets could still contain mislabels. Of course, the most reliable approach is to check the validation and test images ourselves.
So, I suggest the following procedure:
1. Double-check the CheXpert valid and test sets. Given their small size and their reliability, this should be easy. (If we decide we need more datasets, we can also pull train images!)
2. Fix any mislabels we find, then train a binary classification model (frontal vs. lateral). I think the difference between the two views is much clearer than in the 14-label CheXpert classification task.
3. Check the accuracy score and inspect the errors.
4. Check the valid/test sets of MIMIC-CXR and BRAX in the same way as in step 1.
5. Run the model from step 3 on the datasets from step 4.
6. Check the results.
7. Repeat with other datasets and their train/valid/test splits, and double-check the results.
This method may take more time than other approaches, but it would let us give users an automated way to clean up mislabels. (We can ship the .pth weights and notebooks!)
from cxrail-dev.
Today, I checked test_labels.csv for the CheXpert test set from the CheXlocalize dataset, which I downloaded via Azure Storage Explorer. Unlike train.csv and valid.csv, it has no 'Frontal/Lateral' column, so I extracted the view position from the file names. (Every split's CSV encodes the view position in its 'Path' column.) I then checked, one by one, that the view position values matched the images. Fortunately, no mislabels were found.
Using the same methodology, I then checked train.csv and valid.csv in the CheXpert dataset. Here I couldn't compare the images one by one against the view positions from the 'Path' column, but that isn't necessary, because both train.csv and valid.csv have a 'Frontal/Lateral' column. So instead I compared the view position parsed from each file name against the 'Frontal/Lateral' column in the CSV.
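The comparison described above can be sketched with the standard library alone. The column names ('Path', 'Frontal/Lateral') follow the CheXpert CSVs; the tiny inline CSV and its paths are made up purely for illustration.

```python
import csv
import io
import os

def view_from_path(path: str) -> str:
    """Infer the view position from a CheXpert-style file name
    such as 'view1_frontal.jpg'."""
    name = os.path.basename(path).lower()
    if "frontal" in name:
        return "Frontal"
    if "lateral" in name:
        return "Lateral"
    return "Unknown"

def find_view_mismatches(csv_file) -> list:
    """Return rows whose 'Frontal/Lateral' column disagrees
    with the view encoded in the file name."""
    return [row for row in csv.DictReader(csv_file)
            if view_from_path(row["Path"]) != row["Frontal/Lateral"]]

# Made-up two-row example; a real check would open train.csv / valid.csv.
sample = io.StringIO(
    "Path,Frontal/Lateral\n"
    "patient1/study1/view1_frontal.jpg,Frontal\n"
    "patient2/study1/view2_lateral.jpg,Frontal\n"  # deliberate mismatch
)
print(find_view_mismatches(sample))  # only the second row is flagged
```

An empty result list over the full train.csv and valid.csv would confirm the "no mislabels" finding reported here.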
These two images show the results: the top one for train.csv and the bottom one for valid.csv. As you can see, fortunately, there are no mislabeled values in the CheXpert dataset! (Of course, it would be even better to double-check the images against the column values one by one, but that isn't feasible given our limited time.)
Thanks for the great discussion! We should run the same check on the MIMIC dataset as well.
from cxrail-dev.