cxrail-dev's People

Contributors

chrstnkgn, hoon-hoon-tiger, jieonh, juppak, kdg1993, seoulsky-field, yisakk


cxrail-dev's Issues

Features: Make Random Augment & Asymmetric loss tunable

What

Make Random Augment & Asymmetric loss tunable

Why

To achieve a higher score, we need to tune sensitive hyperparameters.
Far more things could be tuned, but given the restrictions on time and resources,
it is better to focus on high-leverage hyperparameters (those with large potential impact that cover various situations).

According to the papers (Random Augment & Asymmetric loss), these methods have sensitive hyperparameters.
In other words, they have hyperparameters with potential.
(E.g. augmentation strength is sensitive to data size and model complexity.)
Also, a loss function and augmentation are needed in every training run, regardless of the data or model.

In conclusion, tuning augmentation and loss satisfies both potential and coverage.

How

  • Make Random Augment tunable
  • Make Asymmetric loss tunable
  • Simple test (small subset of data) with wandb
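To make the dependence on these hyperparameters concrete, here is a minimal scalar sketch of the asymmetric loss with its sensitive hyperparameters (gamma_pos, gamma_neg, clip) exposed as tunable arguments. It follows the formulas in the ASL paper, but it is an illustration only, not our actual implementation (which operates on tensors):

```python
import math

def asymmetric_loss(p, y, gamma_pos=1.0, gamma_neg=4.0, clip=0.05):
    """Asymmetric loss for a single label.

    p: predicted probability of the positive class, y: target in {0, 1}.
    gamma_pos / gamma_neg: focusing parameters (tunable).
    clip: probability margin for negatives (tunable).
    """
    eps = 1e-8
    if y == 1:
        # down-weight easy positives by (1 - p)^gamma_pos
        return -((1.0 - p) ** gamma_pos) * math.log(p + eps)
    # asymmetric clipping: shift the negative's probability down by `clip`
    p_m = max(p - clip, 0.0)
    return -(p_m ** gamma_neg) * math.log(1.0 - p_m + eps)
```

With gamma_pos = gamma_neg = 0 and clip = 0 this reduces to plain binary cross-entropy, which is a handy sanity check when wiring the hyperparameters into the search space.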

Features: Replace current train metric logger

What

  • Replace the current train metric tracker with the current inference metric tracker.

Why

  • We currently use different metric trackers to do the same work in the trainer and in inference. Unifying them is important for consistency and reduces confusion.

How

  • Use the inference metric tracker in the trainer.

Features: Add a model soup function

What

Make our code support the model soup method.

Why

I think model soup is one of the best generalization methods. If we support it, we can provide users more reliability and flexibility in their experiments.

How

We can reference two papers: "Model Soups: averaging weights of multiple fine-tuned models
improves accuracy without increasing inference time"
and "Model Soups improve performance of dermoscopic skin cancer classifiers".
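The core of the method is just weight averaging. A minimal sketch with plain Python dicts standing in for PyTorch state dicts (real code would average torch tensors, e.g. with torch.stack(...).mean(0)):

```python
def uniform_soup(state_dicts):
    """Average a list of model state dicts elementwise (uniform model soup).

    Each state dict maps parameter names to lists of floats; with PyTorch
    models you would average tensors instead.
    """
    if not state_dicts:
        raise ValueError("need at least one state dict")
    n = len(state_dicts)
    soup = {}
    for k in state_dicts[0]:
        # elementwise mean across all ingredient models
        soup[k] = [sum(vals) / n for vals in zip(*(sd[k] for sd in state_dicts))]
    return soup
```

A "greedy soup" variant would add ingredients one at a time and keep each only if the validation score improves, as described in the first paper.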

Add log analysis tool

Summary: Implement a tool to collect and summarize log files, assuming the user runs experiments via hydra multi-run

To Do

  • Handle sub-experiments (by hydra multi-run) using dictionary type
  • Class type
  • Provide summary DataFrame
  • Show the best score and its tuned hyperparameters
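The summary step could look roughly like this. The sketch uses plain dicts and a list of rows instead of a pandas DataFrame, and the `results` layout (sub-experiment name mapped to score and params) is an assumption about what the collector would produce:

```python
def summarize_multirun(results):
    """Summarize hydra multi-run sub-experiments.

    `results` maps a sub-experiment name (one per multi-run job) to a dict
    holding its final score and tuned hyperparameters. Returns all rows
    sorted by score (best first) plus the best entry.
    """
    rows = [
        {"experiment": name, "score": r["score"], **r.get("params", {})}
        for name, r in results.items()
    ]
    rows.sort(key=lambda row: row["score"], reverse=True)
    best = rows[0] if rows else None
    return rows, best
```

Converting `rows` to a DataFrame for display would be a one-liner once pandas is in the picture.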

Features: CLI logging using rich

What

We discussed this problem in #19.
Change the current CLI logging to use the "rich" library.
Current: we just call print() to report the validation score, loss, epoch, and batch_id at specific batch numbers, with no progress bar.

Why

When observing the CLI output during training, we run into inconveniences such as "when will the training finish?" and "I want simpler CLI logging!".
So, based on these experiences, we decided to change the CLI reporting!

It is really important to provide correct, convenient, and easy-to-read CLI logging,
and I think the "rich" library meets these conditions!

How

We can reference the official rich documentation and examples.
rich is also used in lightning-hydra-template, so that can be referenced too.
I plan to use both the "progress bar" and "status" features appropriately.

  • Apply progress in single gpu, without RayTune
  • Support progress in single gpu, with RayTune
  • Support progress in multi gpu, with RayTune
  • Modularize with class (ex. RichProgressBar)

Features: EDA for CheXpert data

What

  • EDA for the CheXpert data
  • Especially the target label distribution and pathological aspect

Why

  • Inspired by the creative suggestion from @chrstnkgn and the good motivation of @seoulsky-field in #10, I made up my mind to explore the CheXpert data set further
  • Also, the excellent replies from @jieonh about the rank-2 CheXpert leaderboard and from @chrstnkgn about the CheXpert datasheet made me more curious about the labeling system and uncertain about the similarity of the train and validation distributions

How

  • Explore the target label (especially the uncertain label)
  • Analyze target class converting
  • Pathological Hierarchy
  • Image aspect (future plan)
  • (New task) Add test set analysis

Features: Support more models, losses, optimizers

What

  • Support more models, losses, optimizers

Why

  • While discussing experiments, we decided to use three models: ResNet50, DenseNet121, and Swin Transformer.
  • I don't think supporting more models, losses, and optimizers is difficult work.

How

  • Support Swin Transformer.
  • Implement and support focal loss.
  • Support RMSprop optimizer.
  • Support AdamW optimizer.
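For the focal loss item, here is a minimal scalar sketch of the binary focal loss from the RetinaNet paper, FL = -alpha_t * (1 - p_t)^gamma * log(p_t). Our real implementation would operate on tensors; this is just the formula made runnable:

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Scalar binary focal loss.

    p: predicted probability of the positive class, y: target in {0, 1}.
    gamma: focusing parameter; alpha: class-balancing weight.
    """
    eps = 1e-8
    if y == 1:
        p_t, alpha_t = p, alpha
    else:
        p_t, alpha_t = 1.0 - p, 1.0 - alpha
    # (1 - p_t)^gamma down-weights well-classified (easy) examples
    return -alpha_t * ((1.0 - p_t) ** gamma) * math.log(p_t + eps)
```

Setting gamma = 0 recovers alpha-weighted cross-entropy, which is a useful sanity check.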

Features: Change the execution method and config structure of ray

What

Instead of executing ray by changing the mode config, put it in the hparams_search config and override it.
(Refer to the structure of the lightning-hydra-template)

Why

  • I think this will simplify not only the config structure but also the parameter selection code in the trainval function of train.py
  • This structure will make it possible to apply hyperparameter tuning tools other than ray (e.g. optuna, wandb sweep, etc.)

How

  • Remove mode config and create hparams_search config
  • Modify parameter selection structure on train.py and main.py
  • Modify ray.yaml, default.yaml config to fit the structure
  • Apply other hyperparameter tuning tools (For future. Not a priority)

Add hyperparameter choice structure

Summary: Ideas for a structural design that gives users the freedom to either set hyperparameters to a fixed value (by hydra config) or tune them (by Ray Tune)
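One possible shape for this: each config entry is either a plain value (fixed) or a small search-space spec handed to the tuner. The dict format below ({"choice": ...} / {"uniform": ...}) is hypothetical, standing in for hydra-configured Ray Tune spaces:

```python
import random

def resolve_hparam(spec, rng=random):
    """Resolve one hyperparameter from a config entry.

    A plain value stays fixed (set by hydra config); a dict describes a
    search space to sample from (a stand-in for handing the spec to a
    tuner such as Ray Tune).
    """
    if not isinstance(spec, dict):
        return spec  # fixed value
    if "choice" in spec:
        return rng.choice(spec["choice"])
    if "uniform" in spec:
        lo, hi = spec["uniform"]
        return rng.uniform(lo, hi)
    raise ValueError(f"unknown search spec: {spec}")
```

The same resolution function can serve both modes, so the training code never needs to know whether a value was fixed or tuned.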

Features: Add CheXpert train csv made by CheXbert

What

Add a train.csv made by CheXbert.


Why

The CheXpert dataset ships with a CSV made by the CheXpert labeler, but other options don't exist yet.
Adding one can be helpful for building a more varied CheXpert benchmark.

How

AIMI provides train_cheXbert.csv and train_visualCheXbert.csv.

  • Download both CheXbert CSVs.
  • Analyze the difference between visualCheXbert and CheXbert.
  • Train and compare the results.

Add experiment logging analysis script

A script for basic analysis of hydra multirun + ray tune log data is needed

To do

  • Show the best trial's learning progress
  • Get the best trial's configuration
  • Compare scores (at the hydra multirun level)

Feature: Customize CLI reporter

What

Customize CLI reporter to print output at appropriate intervals. (per epoch, etc.)

Why

  • The default CLI reporter prints output too frequently, making it difficult to check the results.
  • Previously, I made a custom reporter (reporting at the end of each trial) to solve that problem, but its reporting cycle was inconveniently long.

It is not a priority, and when I analyzed it last time, it was harder than expected to change the output to print on a cycle rather than per ray Trial. But I still think it's worth looking into, since the need has come up several times.

How

  • Choose appropriate reporting intervals
  • Analyze the CLI reporter class
  • Customize reporter
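Whatever reporter class we end up customizing, the throttling logic itself is small. A stand-alone sketch (the class name and interface are illustrative, not ray's API):

```python
class IntervalReporter:
    """Report only on every `interval`-th call (e.g. once per epoch)
    instead of every batch or every ray Trial.

    This isolates the throttling logic we would plug into a customized
    reporter; a time-based variant would compare timestamps instead.
    """

    def __init__(self, interval):
        self.interval = interval
        self._count = 0

    def should_report(self):
        """Return True on every `interval`-th call."""
        self._count += 1
        return self._count % self.interval == 0
```

The training loop would then guard its print/report call with `if reporter.should_report(): ...`.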

Hotfix: Conditional training cannot use transform.py

What

If we use the conditional_train option "conditional", it doesn't work (the first error comes from transform.py).

Why

While applying custom_metrics.py to both train.py and conditional_train.py, I found that conditional_train.py doesn't work well.
Because conditional_train.py has not been updated since @juppak's changes, some of its code no longer works.
Also, the error in transform.py can be fixed by revising hydra_cfg to hydra_cfg.Dataset.

(Screenshot, 2023-01-18: transform.py error traceback)


How

  • This also relates to issue #81, so I'll report it to @chrstnkgn.
  • Revise conditional_train.py to work well

Discussion: Analysis of Hydra Multi-Run Operation

What

For more sophisticated benchmark experiments, there is a need to understand how Hydra actually operates multi-runs.
(Whether it finishes each trial and runs the next one in a row, or runs several trials (through an override method or something) and then returns all the results at once, etc.)

Why

Currently, several problems arise from our lack of understanding of Hydra's multi-run operation.
For instance,

  • Custom logging is being recorded repeatedly as many times as the number of hydra multi-runs.
  • It is unclear how to build end-to-end pipelines that execute train -> inference at once in a multi-run experiment.

How

  • Hydra multi-run code analysis
  • Modify custom logging method (if necessary)

Features: Create an inference python file

What

While discussing how to load model weights from a directory for model soups, we noticed that we should create an inference (benchmark) python file.

Why

Many benchmark repositories on GitHub have an inference (or benchmark) python file. Also, some of the functions we plan to support need an inference file.

How

We can reference the validation function that already exists in train.py.
We'll also reference benchmark repositories from NeurIPS, MICCAI, etc.
Any opinions welcome!

Features: Subdivide Hydra log directory

What

  • Subdividing Hydra logging directory
  • The results are likely to be as follows.
    • single run: logs/train/run/{custom_exp_name}/2023-01-05_05-57-24
    • multi run: logs/train/multirun/{custom_exp_name}/{multirun-trial-number}/2023-01-05_05-57-24

Why

Currently, the outermost logging folder is named with a timestamp, which makes runs inconvenient to distinguish.

How

  • Separate the multirun / run directories
  • Add a custom experiment name
  • Simplify the multirun subdir (override_dirname -> number)
  • Check for conflicts or errors due to changes

Features: WandB as an Option

What

  • Set WandB as an optional logger
  • Log CLI outputs as a log file when no logger is chosen

Why

  • Considering the released version of our repo, I figured there might be users who have no experience with WandB
  • This means it might be inappropriate to set WandB as the default logger

How

  • Make WandB optional
  • Make a simple logging process for the users who did not choose any logging option

Features: Apply simple early-stop in the train code

What

  • Apply simple early-stop in the train code

Why

  • Motivated by the discussion with @kdg1993 (#30), we concluded that applying a simple early-stop as the trial terminator is an appropriate solution for the initial implementation
  • I think this is an urgent issue to resolve if we are planning to conduct the experiment before this week ends, as our trial terminator for ray is not working quite well

How

  • Implement a simple early-stop at the end of our train loop
  • Add a Hydra option to set the patience step for the early stop
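The early-stop logic itself could be as small as this sketch (names are illustrative, and `patience` would come from the new Hydra option):

```python
class EarlyStopper:
    """Stop training when the monitored score has not improved for
    `patience` consecutive checks (higher score = better)."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.bad_steps = 0

    def step(self, score):
        """Record one validation score; return True when training should stop."""
        if score > self.best + self.min_delta:
            self.best = score
            self.bad_steps = 0
        else:
            self.bad_steps += 1
        return self.bad_steps >= self.patience
```

At the end of each validation pass the train loop would call `if stopper.step(val_auc): break`.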

Hotfix: Ray related and Working directory problem

What


  • We now have a simpler structure for ray tune thanks to @jieonh 's work (#27) and conditional training & label smoothing thanks to @juppak (#31), but we have several new bugs to fix.

Why


  1. The logging directory is not created in the wandb.init() step
  2. Conditional training is logged together in WandB with the actual training process

How

  • Hotfix needed
    • I suppose the structure of initializing wandb should be modified
    • After fixing the bug, I plan to explain what caused the problem and what I changed, for a better understanding among all members

Discussion: How to save pytorch model weights in each sub-experiment

What

  • Discuss how to save pytorch model weights in each sub experiment in hydra multirun + no ray tune setting
  • I guess we need to discuss the thing we need to log, log file names, and logging structure maybe

Why

  • There are 4 different cases ( [hydra multirun on/off] X [ray tune on/off] ) when we do experiments with our custom code set
  • I figured out that under the hydra multirun + no ray tune setting, only one best model is saved (not sure about other environments, but that is the case for me), like below

./logs
└── 2022-12-08_02-12-25
├── best_saved.pth
├── epochs=2,loss=BCE,mode=default,model=mobilenetv3_small_050,num_samples=2,optimizer=adam
│ └── main_Tuner.log
├── epochs=2,loss=BCE,mode=default,model=tinynet_e,num_samples=2,optimizer=adam
│ └── main_Tuner.log
├── epochs=2,loss=Multi_Soft_Margin,mode=default,model=mobilenetv3_small_050,num_samples=2,optimizer=adam
│ └── main_Tuner.log
├── epochs=2,loss=Multi_Soft_Margin,mode=default,model=tinynet_e,num_samples=2,optimizer=adam
│ └── main_Tuner.log
└── multirun.yaml

How

  • #14
  • Apply to the code

Features: Implement augmentation

What

Implement user-friendly & basic image augmentation

Why

  • Augmentation is one of the important parts of DL, but our code set does not offer many options so far
    • The current augmentation, based on the CheXpert leaderboard, is quite a good choice, but we need more to cover MIMIC, BRAX, and the others
  • As a benchmark test bed, persuasive augmentation options will reduce the user's experimental burden
  • Good augmentation can contribute to better performance

How

  • Search implementable strategies
  • torchvision auto-augmentation Implementation
  • torchvision auto-augmentation code test
  • albumentations auto-augmentation Implementation (future work)
  • albumentations auto-augmentation code test (future work)
  • RandAugment implementation
  • RandAugment code test
  • Make augmentation result visualization code

Hotfix: Minor bugs and working directory problem

What

  • CLI reporter not being printed in hydra multirun settings (from the second run)
  • Change the working directory for hydra multirun without Ray

Why

CLI Reporter

  • Not a MAJOR problem, but it hinders monitoring which was quite annoying

Working directory

  • Refer to: #13

How

  • Check if there were any conflicts regarding Ray reporter and hydra settings
  • Change the working directory for single run
  • Change the working directory for hydra multi-run

Hotfix: Inspect all of codes

What

  • Inspect and revise all of train, utils, inference codes.

Why

Some code has hard-coded default settings such as num_classes = 5; these should be checked before experiments, new functions, or hyperparameter tuning.
Moreover, if any file has not had pre-commit applied, it will be applied in this issue.
In addition, if it is more reasonable to use the name "utils" rather than "custom_utils", that can be changed in this issue.

Note: this issue is not about whether everything currently works. Its purpose is "checking the codes", not "making every option work".

How

  • Inspect & Revise trainer
  • Inspect & Revise inference
  • Inspect & Revise custom_utils

Features: Configure Hydra config directory and files

What

  • Make the Hydra config format neater and more straightforward

Why

  • As the project proceeds, the parameters that we need to handle are getting broader and more complicated
  • We have discussed how to change our config format several times during meetings

How

(Edited) I found it necessary to divide the work into small portions for the team members' understanding and to follow the flow of fast-merging branches. Therefore, I will comment my work on this issue piece by piece and publish pull requests based on those comments.

Hotfix: Error while using both mode=raytune & logging=wandb

Intro

  • At first, I am not certain whether this mode combination (using raytune & wandb simultaneously) simply isn't implemented yet.
    I'm really sorry if I misunderstood the progress of this work.
  • Secondly, it probably occurred due to the specific environment of my docker container.

Circumstance

  • python main.py model=tinynet_e epochs=3 num_samples=2 Dataset.train_size=0.2 logging=wandb mode=raytune
  • Branch : develop (commit 6b6f63d)
  • Code change : No

Error

mode=raytune
working dir: /home/CheXpert_code/kdg/CXRAIL-dev
[2022-12-21 01:57:23,674][ray.tune.tune][INFO] - Initializing Ray automatically.For cluster usage or custom Ray initialization, call ray.init(...) before tune.run.
2022-12-21 01:57:26,066 INFO worker.py:1529 -- Started a local Ray instance. View the dashboard at http://127.0.0.1:8265
Error executing job with overrides: ['model=tinynet_e', 'epochs=3', 'num_samples=2', 'Dataset.train_size=0.2', 'logging=wandb', 'mode=raytune']
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/ray/tune/tuner.py", line 272, in fit
return self._local_tuner.fit()
File "/usr/local/lib/python3.8/site-packages/ray/tune/impl/tuner_internal.py", line 420, in fit
analysis = self._fit_internal(trainable, param_space)
File "/usr/local/lib/python3.8/site-packages/ray/tune/impl/tuner_internal.py", line 532, in _fit_internal
analysis = run(
File "/usr/local/lib/python3.8/site-packages/ray/tune/tune.py", line 626, in run
callbacks = _create_default_callbacks(
File "/usr/local/lib/python3.8/site-packages/ray/tune/utils/callback.py", line 105, in _create_default_callbacks
callbacks.append(TBXLoggerCallback())
File "/usr/local/lib/python3.8/site-packages/ray/tune/logger/tensorboardx.py", line 165, in init
from tensorboardX import SummaryWriter
File "/usr/local/lib/python3.8/site-packages/tensorboardX/init.py", line 5, in
from .torchvis import TorchVis
File "/usr/local/lib/python3.8/site-packages/tensorboardX/torchvis.py", line 10, in
from .writer import SummaryWriter
File "/usr/local/lib/python3.8/site-packages/tensorboardX/writer.py", line 16, in
from .comet_utils import CometLogger
File "/usr/local/lib/python3.8/site-packages/tensorboardX/comet_utils.py", line 7, in
from .summary import _clean_tag
File "/usr/local/lib/python3.8/site-packages/tensorboardX/summary.py", line 12, in
from .proto.summary_pb2 import Summary
File "/usr/local/lib/python3.8/site-packages/tensorboardX/proto/summary_pb2.py", line 16, in
from tensorboardX.proto import tensor_pb2 as tensorboardX_dot_proto_dot_tensor__pb2
File "/usr/local/lib/python3.8/site-packages/tensorboardX/proto/tensor_pb2.py", line 16, in
from tensorboardX.proto import resource_handle_pb2 as tensorboardX_dot_proto_dot_resource__handle__pb2
File "/usr/local/lib/python3.8/site-packages/tensorboardX/proto/resource_handle_pb2.py", line 36, in
_descriptor.FieldDescriptor(
File "/usr/local/lib/python3.8/site-packages/google/protobuf/descriptor.py", line 560, in new
_message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:

  1. Downgrade the protobuf package to 3.20.x or lower.
  2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "main.py", line 94, in main
raytune(hydra_cfg)
File "main.py", line 61, in raytune
analysis = tuner.fit()
File "/usr/local/lib/python3.8/site-packages/ray/tune/tuner.py", line 274, in fit
raise TuneError(
ray.tune.error.TuneError: The Ray Tune run failed. Please inspect the previous error messages for a cause. After fixing the issue, you can restart the run from scratch or continue this run. To continue this run, you can use tuner = Tuner.restore("/home/CheXpert_code/kdg/CXRAIL-dev/logs/2022-12-21_01-57-23/Dataset.train_size=0.2,epochs=3,logging=wandb,mode=raytune,model=tinynet_e,num_samples=2/trainval_2022-12-21_01-57-23").

Suspected reason

  • Python version and dependency conflict

Related to

Hotfix: Too long file name raises OSError

What

The command is

python main.py --multirun model=resnet,densenet
logging=wandb project_name='aug_efficacy'
logging.setup.name='augmentation_efficacy_test'
conditional_train=none
Dataset.augmentation_mode="auto","random","custom"
hparams_search=raytune
hparams_search.tune_config.num_samples=10
hparams_search.tune_config.scheduler.grace_period=100000
hparams_search.param_space.lr.lower=1e-5
hparams_search.param_space.batch_size.categories=[32,64]

The raised error is

[2023-01-03 05:39:43,143][HYDRA] Launching 6 jobs locally
[2023-01-03 05:39:43,143][HYDRA] #0 : model=resnet logging=wandb project_name=aug_efficacy logging.setup.name=augmentation_efficacy_test conditional_train=none Dataset.augmentation_mode=auto hparams_search=raytune hparams_search.tune_config.num_samples=10 hparams_search.tune_config.scheduler.grace_period=100000 hparams_search.param_space.lr.lower=1e-05
Traceback (most recent call last):
File "/usr/local/lib/python3.8/pathlib.py", line 1288, in mkdir
self._accessor.mkdir(self, mode)
OSError: [Errno 36] File name too long: 'logs/train/2023-01-03_05-39-41/Dataset.augmentation_mode=auto,conditional_train=none,hparams_search.param_space.lr.lower=1e-05,hparams_search.tune_config.num_samples=10,hparams_search.tune_config.scheduler.grace_period=100000,hparams_search=raytune,logging.setup.name=augmentation_efficacy_test,logging=wandb,model=resnet,project_name=aug_efficacy'

Why

The overly long override-based directory name raises an OSError during mkdir.

How

Need to find a solution by discussion
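One candidate solution for the discussion: keep a truncated prefix of the override dirname and append a short hash of the full string, so names stay unique but under the filesystem limit (255 bytes on most Linux filesystems). A sketch, with function name and limit chosen by me:

```python
import hashlib

def shorten_dirname(name, max_len=100):
    """Return `name` unchanged if short enough; otherwise truncate it and
    append an 8-char SHA-1 digest of the full string so that distinct
    override combinations still map to distinct directories."""
    if len(name) <= max_len:
        return name
    digest = hashlib.sha1(name.encode()).hexdigest()[:8]
    # prefix keeps the name human-readable, digest keeps it unique
    return f"{name[:max_len - 9]}-{digest}"
```

The full override string could still be written into a file inside the directory for lookup.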

Features: Apply WandB logger in Ray with the same format as default running

What


  • Apply WandB logger in Ray with the same format as default running

Why


  • It is hard to customize logging configs and outputs using WandbLoggerCallback from ray
  • We would want to simplify logging outputs in ray-WandB to make them in line with those from default running (w/o ray tune)

How


  • Try customizing setup_wandb function instead of using WandbLoggerCallback from Ray

Features: Do EDA MIMIC-CXR

What

Do EDA (Exploratory Data Analysis) on MIMIC-CXR!

Why

It's necessary for applying MIMIC-CXR in our code.
In particular, we discussed the differences between the CheXpert CSV and the MIMIC-CXR CSV.
In the EDA, I will focus on the AP/PA distinction!

How

I'll upload an EDA notebook to the notebook directory.
The notebook will mainly cover the AP/PA distinction and the labels.

Features: Asking for help to add new policies to convert CheXpert target class in our custom Dataset Class

What

  • Add more policy options based on statistical or intuitive aspects of missing and label converting (Not based on domain knowledge or score)

Why

While I've looked around the target class distribution of CheXpert CSV data, I found an interesting possibility for data handling.
The figure below is a snapshot of target distribution by my personal exploration of CheXpert.

image

Meanwhile, our current custom Dataset class converts as follows (not sure, but I guess this way of converting is based on score):

Nan -> 0
-1 -> 1 ( if the target is 'Edema' or 'Atelectasis' )
-1 -> 0 ( if the target is neither 'Edema' nor 'Atelectasis')

  • In my opinion, converting Nan to 0 is acceptable because 'nothing' often means False (0). Thus, the thing is converting the '-1'
  • In the train set, 11 of the 14 disease columns have more 1 labels than 0 labels. Thus, converting -1 to 1 also makes sense to me
  • In line with distribution-based thinking, converting -1 by random sampling from the total set of 0 and 1 could also be an interesting approach

Likewise, I think there are many ways to apply statistical or intuitive aspects of handling missing values in the traditional ML field. So, I want to discuss it and carefully ask for help to make this idea possible to use in our custom codes

FYI, I include the distribution of valid set just for sharing knowledge but I'm afraid that considering the validation set distribution might be connected to the data leakage issue. Probably everyone knows already but mentioned it just for reminding 😄

How

  • Any kind of interesting idea can be an option
  • My simple idea now is to consider the major class for converting candidate or random sampling
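To make the discussion concrete, here is a sketch of what a policy-aware converter could look like. The "default" branch mirrors the current Dataset class behavior described above; the "ones" and "random" policies are the proposed additions, and all names here are hypothetical:

```python
import random

def convert_label(value, column, policy="default", pos_prior=0.5, rng=random):
    """Convert a raw CheXpert label (1, 0, -1, or None for blank).

    policy="default": NaN -> 0, and -1 -> 1 only for 'Edema'/'Atelectasis'
    (the current behavior). policy="ones": every -1 -> 1. policy="random":
    resample -1 from a Bernoulli with the column's positive prior.
    """
    if value is None:  # NaN: 'nothing' usually means negative
        return 0
    if value != -1:
        return int(value)
    if policy == "default":
        return 1 if column in ("Edema", "Atelectasis") else 0
    if policy == "ones":
        return 1
    if policy == "random":
        return 1 if rng.random() < pos_prior else 0
    raise ValueError(f"unknown policy: {policy}")
```

The policy and per-column priors could then be exposed as Dataset config options in hydra.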

Hotfix: Change the order of train_size in the preprocessing sequence

What

Changing the order
from: restrict train_size by sampling -> frontal or lateral restriction -> enhancement
to: frontal or lateral restriction -> enhancement -> restrict train_size by sampling

Why

So far, the training data size restriction has been done at an early stage of data preprocessing.
However, the current process can return fewer samples than the given integer or float (thanks for noticing, @seoulsky-field).
For example, if you set train_size to 100 and use_frontal to True, the codeset samples 100 rows and then selects the frontal images,
so it returns <= 100 images.
To avoid this, I checked which dataset options affect the number of samples and
figured out that use_frontal & enhancement (upsampling) can reduce or increase the number.

While analyzing the effects of these processing options,
I figured out that enhancement is quite complicated and might return a result far from what the user expected.
Currently, the enhancement accepts multiple target columns and n_times (the amount of upsampling).
Since the enhancement treats each target column independently (i.e. it does not consider co-occurrence),
it duplicates rows more than the given n_times due to the inherent traits of the multi-label problem.

Here is a really simple example of the enhancing sequence in our codeset.
original (3A, 4B) -> enhancing 'A' 2-times (6A, 6B) -> enhancing 'B' 2-times (8A, 10B), i.e. more than 2-times of both 'A' and 'B'.

original    added by enhancing 'A'    added by enhancing 'B'
A B         A B                       A B
1 0         1 0                       1 1
1 1         1 1                       1 1
1 1         1 1                       0 1
0 1                                   0 1
0 1

It is difficult to determine which way of enhancing (upsampling) is right, but we should definitely be aware of this behavior.

How

  • Code change
  • Test the length of returning dataset (length of self.df)
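The duplication effect above can be reproduced in a few lines. The `enhance` helper below mimics the described behavior (each target upsampled independently from the original rows); it is a sketch for the length test, not the actual codeset:

```python
def enhance(rows, targets, n_times):
    """Upsample each target column independently: for every target, append
    (n_times - 1) extra copies of the rows that are positive for it in the
    ORIGINAL data, ignoring co-occurrence with other targets."""
    out = list(rows)
    for t in targets:
        positives = [r for r in rows if r[t] == 1]
        out.extend(positives * (n_times - 1))
    return out

# the 2-label example from above: 3 'A' positives, 4 'B' positives
rows = [{"A": 1, "B": 0}, {"A": 1, "B": 1}, {"A": 1, "B": 1},
        {"A": 0, "B": 1}, {"A": 0, "B": 1}]
enhanced = enhance(rows, ["A", "B"], n_times=2)
# 'A' count becomes 8 (> 2x3) and 'B' count becomes 10 (> 2x4),
# because rows positive for both labels are duplicated twice
```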

Discussion: Alternatives for ASHAscheduler in Ray

What

  • Find alternatives for ASHA scheduler
    • Possible integrations that I can think of now are:
    1. Apply early stopping with a certain patience
    2. Find another scheduler that ray provides that better fits our need

Why

  • While developing and enhancing this project, we have repeatedly discussed the scheduler that terminates the training process, and the main issue was that the ASHA scheduler does not fit our purpose, as it is an algorithm that works well in a multi-processing environment.
  • If we have firmly decided to stick with ray tune, then we need to seek algorithms that better fulfill our needs and terminate the process at the appropriate timing

How

  • There seem to be several options that we can consider according to the ray docs (https://docs.ray.io/en/latest/tune/api_docs/schedulers.html), but if any of them does not seem appropriate, then it might be better to just go with early stopping
  • I don't think that it is the part that I can decide alone, so I kindly ask you to freely discuss and provide various opinions here! 🙏

Feature: Refine Hyperparameter Tuning

What

Overall parameter tuning is required when finalizing the benchmark design. In order to provide detailed, optimized tuning results for each task, as the retina benchmark does, the current hyperparameter tuning structure needs to be refined.

Why

If hyperparameter tuning is going to be performed throughout the code, in addition to the current basic config tuning (lr, batch_size, etc.), some parts of the current structure need to change.
The following areas might be considered:

  1. Parameters that are included only in specific cases

    • ex) gamma_neg, gamma_pos in AsymmetricLoss
  2. Some tuning results might vary depending on the combination

    • ex) best learning rate for each model architecture
      • DenseNet : 1e-4, ResNet: 1e-5
  3. Currently, all parameters are included in the ray tune config's param_space, but this part needs to be divided in more detail.
    ex)

    • gamma_neg, gamma_pos -> AsymmetricLoss config
    • lr, weight_decay, betas, eps -> Optimizer config
    • batch_size, seed -> Experimental setting config

ref: retina_benchmark

How

  • Include ASL configs in searchspace -> To work only when using ASLoss

(The part below is still in the process of planning)

  • [python code] Modify the hyperparameter selecting structure (in train.py: trainval; main.py: default, raytune)
  • [hydra yaml config] Refine the search space structure

Features: WandB logging part as a Hydra option

What

To Add

  • Change WandB logging part as a Hydra option to:
    • make WandB logger able to be turned on/off
    • automatically assign experiment name when running the script
    • Add option to use WandB when running the experiment without Ray

Why

  • Initially, I added a WandB logging option supported by Ray to keep track of the experimental results
  • Then found out that it would be nicer (in terms of both convenience and code clarity) to make WandB optional regardless of the usage of Ray

How

  • WandB option into Hydra, default setting: ON
  • WandB option when not using ray
  • Come up with WandB convention (How to set the project name?) -> Any opinion or discussion would be appreciated

Discussion&Hotfix: RayTune values are always random.

What

  • Seed fixing is not applied in RayTune.

Why

  • I thought seed fixing also worked in RayTune. However, as the two result images below showed, the seed is not fixed in RayTune. (No options changed, no code changed.)
    (two result screenshots, 2023-01-05)
  • From the perspective of reproducibility, I think we should fix the seed in RayTune.

How

  • Will be decided by discussion.
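If the discussion lands on per-trial seeding, one common pattern is to derive a deterministic seed from a base seed plus the trial index and re-seed inside the trainable, since Ray workers do not inherit the driver process's seed. A pure-Python sketch (function names are illustrative; real code would also seed torch and numpy):

```python
import random

def run_trial(config, trial_index, base_seed=12345):
    """Sketch of per-trial seeding for reproducible tuning: each trial
    gets its own deterministic seed, so re-running trial k reproduces
    trial k exactly. `rng.uniform` stands in for search-space sampling."""
    seed = base_seed + trial_index
    rng = random.Random(seed)  # re-seed inside the worker, not the driver
    sampled_lr = rng.uniform(1e-5, 1e-3)
    return sampled_lr
```

The same base seed then makes an entire tuning run repeatable trial by trial.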

Features: Ray result analysis tool

What

Create a tool to extract information about best results among several trials of Ray (Tuner.fit())

Why

Simplicity of organizing experimental results

  • As the number of trials increases, it is difficult to analyze them all, and ultimately the reason a user uses a hyperparameter tuning tool is to find the best result.

How

+) This work is almost complete and will be merged with #35 without creating a separate branch since it is an issue directly related to #35

Features: Metrics and CLI in inference.py file

What

  • Apply rich progress bar in inference.py file
  • Append more metrics in inference.py file

Why

In train.py we use rich for CLI reporting, so I think it looks good to use rich in inference.py too.
Also, while AUROC is generally the metric used in medical tasks, we thought it would be good to support more details and more metrics.

How

  • Apply a rich progress bar in the inference.py file.
  • Compute FPR, TPR, and the best threshold.
  • Plot the ROC curve with details.
  • Implement more metrics (e.g. AUPRC, F1-score, accuracy, etc.).
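For the FPR/TPR/best-threshold item, here is a tiny pure-Python sketch that picks the threshold maximizing Youden's J statistic (J = TPR - FPR); in practice we would likely compute the curve with sklearn.metrics.roc_curve instead:

```python
def best_threshold(scores, labels):
    """Return (threshold, tpr, fpr) maximizing Youden's J = TPR - FPR.

    scores: predicted probabilities; labels: 0/1 ground truth.
    Assumes both classes are present in `labels`.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    best, best_j = (0.0, 0.0, 0.0), float("-inf")
    for t in sorted(set(scores)):
        # classify as positive when score >= t
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        tpr, fpr = tp / pos, fp / neg
        if tpr - fpr > best_j:
            best_j, best = tpr - fpr, (t, tpr, fpr)
    return best
```

The same per-threshold sweep also yields the points needed for the ROC curve plot.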

Hotfix: Changing num_samples in default config doesn't work

What

Changing num_samples in the default config doesn't work, but changing hparams_search.tune_config.num_samples does work.

Changing the default config [the case which is not working]
(screenshot)

Changing the hparams_search config [the case which is working]
(screenshot)


Why

Since the default config should be the file with the highest authority, this should be fixed.

How

Need help to fix

Hotfix: Reorganize conditional_train code

What

Reorganize conditional train code, and simplify its config

Why

  • We are aware that we should modify the code for conditional training somehow, but it has been delayed as it was not a priority.
  • However, as we are preparing for our first release, neater (and well-working) code is needed.
  • This issue will cover the following parts:
    1. (conditional_train.py) Merge the train and trainval functions - this format is no longer necessary since we are not using ray tune for conditional learning, and merging will resolve some errors caused by the complicated format.
    2. (conditional_train.yaml) Modify its format - as this is not a major option that users will always take into account, it would be better to place this config somewhere at a lower level. Trimming some unnecessary configs is also needed.

How

  • conditional_train.py
    • merge train and trainval
    • remove unnecessary codes
  • conditional_train.yaml
    • Modify format

Features: Implement Optuna instead of Ray Tune

What

Implement Optuna instead of Ray Tune

Why

Ray is definitely a good hyperparameter tuning tool, but many problems have come up so far when using ray tune and hydra together. Also, if we only need simple tuning, a lighter tool such as Optuna may be better than the more advanced ray. Therefore, I think it is worth applying Optuna instead of Ray Tune and comparing the two pipelines.

How

  • Implement Optuna
  • Compare two pipelines in terms of complexity, convenience, etc.
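If we go with Optuna, one low-friction route is hydra's Optuna sweeper plugin, which keeps tuning inside our existing hydra configs. A sketch of such a config (parameter names are illustrative, not taken from our repo; assumes `pip install hydra-optuna-sweeper`):

```yaml
# config/hparams_search/optuna.yaml (sketch)
defaults:
  - override /hydra/sweeper: optuna

hydra:
  sweeper:
    direction: maximize   # maximize val AUROC
    n_trials: 20
    sampler:
      _target_: optuna.samplers.TPESampler
      seed: 12345
    params:               # search space in the plugin's syntax
      optimizer.lr: interval(1e-5, 1e-2)
      Dataset.batch_size: choice(16, 32, 64)
```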

Hotfix: Raytune + wandb logging is not working

What

I tried to test single run + raytune + wandb logging, but found that wandb logging is not working.

My command is

python main.py model=resnet logging=wandb project_name='kdg_dev_test' logging.setup.name='autoaug_ray_test' conditional_train=none Dataset.auto_augmentation=True hparams_search=raytune hparams_search.tune_config.num_samples=10

I also tried

  • Assigning both : project_name & logging.setup.project
  • Assigning alone : logging.setup.project

However, I got the same result from all trials.

While trying to find the cause, I found a "-" in run_config of raytune.yaml.

I guessed it was a typo, so I removed it and tried again. Then a harder error occurred:

SUCCESS 12345 SEED FIXING
hyperparameter search: raytune
working dir: /home/CheXpert_code/kdg/CXRAIL-dev
[2022-12-28 04:48:27,817][ray.tune.tune][INFO] - Initializing Ray automatically.For cluster usage or custom Ray initialization, call ray.init(...) before tune.run.
2022-12-28 04:48:31,236 INFO worker.py:1529 -- Started a local Ray instance. View the dashboard at http://127.0.0.1:8265
Error executing job with overrides: ['model=resnet', 'logging=wandb', 'project_name=kdg_dev_test', 'logging.setup.name=autoaug_ray_test', 'conditional_train=none', 'Dataset.auto_augmentation=True', 'hparams_search=raytune', 'hparams_search.tune_config.num_samples=10']
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/ray/tune/tuner.py", line 272, in fit
return self._local_tuner.fit()
File "/usr/local/lib/python3.8/site-packages/ray/tune/impl/tuner_internal.py", line 420, in fit
analysis = self._fit_internal(trainable, param_space)
File "/usr/local/lib/python3.8/site-packages/ray/tune/impl/tuner_internal.py", line 532, in _fit_internal
analysis = run(
File "/usr/local/lib/python3.8/site-packages/ray/tune/tune.py", line 626, in run
callbacks = _create_default_callbacks(
File "/usr/local/lib/python3.8/site-packages/ray/tune/utils/callback.py", line 59, in _create_default_callbacks
has_trial_progress_callback = any(
TypeError: 'WandbLoggerCallback' object is not iterable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "main.py", line 85, in main
raytune(hydra_cfg)
File "main.py", line 63, in raytune
analysis = tuner.fit()
File "/usr/local/lib/python3.8/site-packages/ray/tune/tuner.py", line 274, in fit
raise TuneError(
ray.tune.error.TuneError: The Ray Tune run failed. Please inspect the previous error messages for a cause. After fixing the issue, you can restart the run from scratch or continue this run. To continue this run, you can use tuner = Tuner.restore("/home/CheXpert_code/kdg/CXRAIL-dev/logs/2022-12-28_04-48-27/Dataset.auto_augmentation=True,conditional_train=none,hparams_search.tune_config.num_samples=10,hparams_search=raytune,logging.setup.name=autoaug_ray_test,logging=wandb,model=resnet,project_name=kdg_dev_test/trainval_2022-12-28_04-48-27").
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

Since the instantiation of run_config looks clear and correct to me, it is hard to see why the WandbLoggerCallback has the wrong type.

If anyone knows this type of error or has experienced it, please help me overcome it.

Summary:

  • Single run + raytune + wandb logging is not working in my environment
  • I suspect a typo in raytune.yaml
  • Unknown type error by wandbloggercallback occurred

How
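A guess at both symptoms, sketched but not verified: ray's _create_default_callbacks iterates over the callbacks argument, so RunConfig expects a *list* of callbacks. The "-" in raytune.yaml is therefore probably not a typo: removing it passes a bare WandbLoggerCallback object, which triggers the 'object is not iterable' error. The original silent-logging problem likely lies elsewhere (e.g. in the callback's project/name fields). A restored fragment might look like:

```yaml
# run_config section of raytune.yaml (sketch; the exact module path of
# WandbLoggerCallback depends on the ray version):
run_config:
  _target_: ray.air.RunConfig
  callbacks:
    - _target_: ray.air.integrations.wandb.WandbLoggerCallback  # "-" keeps this a list
      project: ${project_name}
```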

Features: Append More Options on CXR dataloader

What

  • The more I looked at previous work on CheXpert, such as Issue #9, the more I saw that some options need to be added:
    1. Label Smoothing
    2. Conditional Training

Why

  • The rank-2 paper (https://arxiv.org/abs/1911.06475) uses conditional training to address the fact that diagnoses are often conditioned on their parent labels, and uses label smoothing to handle the uncertain labels in the dataset.
  • The rank-1 paper (https://arxiv.org/abs/2012.03173) also uses label smoothing. (not yet sure about conditional training)

How

  • Implement a label-smoothing option in the CheXpert dataloader
  • Implement a conditional-training option in the CheXpert dataloader

Comment

  • Implementing the 'Conditional Training' option will probably touch the train & valid parts of our code. 😨😱
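The label-smoothing side can be sketched as below (a hypothetical helper, not our actual dataloader code; the interval bounds follow the U-ones + LSR setup described in arXiv:1911.06475 and would be made configurable):

```python
# Sketch of a label-smoothing transform for uncertain CheXpert labels.
# Uncertain labels are encoded as -1; U-ones + LSR replaces them with a
# random soft target drawn from [low, high].
import random

def smooth_labels(labels, low=0.55, high=0.85, rng=random):
    """Map uncertain labels (-1) to a random target in [low, high]."""
    return [rng.uniform(low, high) if y == -1 else float(y) for y in labels]
```

The dataloader would apply this per-sample when the label-smoothing option is enabled, leaving certain labels (0/1) untouched.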

Features: Inference logging advancement

What

  • Organize test results into a csv file

ref: timm benchmark result
(https://github.com/rwightman/pytorch-image-models/blob/main/results/benchmark-infer-amp-nchw-pt111-cu113-rtx3090.csv)

  • Save important information other than the test score (model, dataset, optimizer, etc.)

Why

Currently, only the AUROC scores on the test dataset are logged, but the inference results need to be organized well for benchmark experiments.

How

  • Extract the necessary information from the hydra logs of the training run to be inferenced
  • Organize the inference result storage path
  • Save the results as a csv file
  • Add other metrics (if possible; it might need to be an independent issue)
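The csv step can be sketched as below (the column names are hypothetical; the real ones would be extracted from the hydra config of each run):

```python
# Sketch of appending one inference run to a timm-style benchmark csv.
import csv
import os

def append_result(path, row, fieldnames=("model", "dataset", "optimizer", "auroc")):
    """Append one result row, writing the header only for a new file."""
    new_file = not os.path.exists(path)
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if new_file:
            writer.writeheader()
        writer.writerow(row)
```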

Discussion: Better ways to improve team-wide understanding of MIMIC datasets

What

  • Discuss important things about MIMIC dataset that the whole team should know
  • Propose formats to analyze and share important points about MIMIC

Why

  • MIMIC has a more complicated data structure than CheXpert
  • It was one of the key issues of last week's meeting
  • Team-wide common reference can improve the efficiency of conversation in meeting

How

  • My simple suggestion is to make a notebook file for EDA MIMIC. Any further suggestions would be very helpful and appreciated!

Features: Multi-GPU training for non-ray setting

What

Implement multi-GPU training for non-ray tune setting

Why

While ray has a built-in parallelization system via its tune resources options,
hydra multirun has no built-in parallel GPU training.

Thus, we need to implement a parallel GPU setting for hydra multirun + non-ray-tune runs, especially for large-scale experiments.

How

I plan to use pytorch's nn.parallel.DistributedDataParallel after reading the references below.

To Do

  • nn.DataParallel implementation
  • Deprecate nn.DataParallel and implement nn.parallel.DistributedDataParallel

Features: Apply torch.amp

What

Apply torch.amp to do experiments faster!

Why

AMP uses both float16 and float32, which can make code execution faster.
So I think it can make our experiments more efficient and more convenient.

How

We can reference PyTorch Image Models (https://github.com/rwightman/pytorch-image-models).
However, it uses apex amp, which is no longer needed now that torch.amp exists.
So I will reference both PyTorch Image Models and the PyTorch documentation.
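A minimal sketch of the standard torch amp training step (model, optimizer, and criterion are placeholders, not our actual code). With the scaler and autocast disabled, the same code runs unchanged on CPU, so the amp flag can be a simple config option:

```python
# Sketch of a mixed-precision training step with torch.amp.
import torch

def train_step(model, optimizer, criterion, x, y, scaler, use_amp):
    optimizer.zero_grad()
    # autocast runs the forward pass in reduced precision when enabled
    with torch.autocast(device_type=x.device.type, enabled=use_amp):
        loss = criterion(model(x), y)
    scaler.scale(loss).backward()  # scaling is a no-op when scaler is disabled
    scaler.step(optimizer)
    scaler.update()
    return loss.item()

# scaler = torch.cuda.amp.GradScaler(enabled=use_amp) would be created once
# outside the loop, with use_amp = torch.cuda.is_available() and cfg.use_amp.
```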

Features: Brief instructions on how to merge MIMIC csv files

What

Write brief instructions on how to merge the MIMIC csv files

Why

  • Unlike the CheXpert dataset, MIMIC has 3 different types of csv files (metadata, split, disease info)
  • The csv files seem to have an ERD-based structure
  • I found a small difference between the foreign keys of the metadata and disease-info files, which is not critical but good to know about
  • One might think this should be included in the EDA. I agree, but after digging into this issue I found that it might be out of the scope of a conventional Kaggle-style EDA
  • I hope this issue helps unify the whole team's MIMIC disease-information dataset and lessen the burden of EDA

How

  • Make a notebook file and upload it (Make a new branch and merge it to tutorials branch & in the tutorials/MIMIC directory)
  • Merge to the tutorials branch
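The join structure the notebook would explain can be sketched as below. In practice we would use pandas.merge on the real csvs; this stdlib version only illustrates the join, and it simplifies the split file, which is actually keyed per dicom_id rather than per study:

```python
# Sketch of joining the three MIMIC-CXR csv files (metadata, split,
# disease labels) on their shared keys (subject_id, study_id).
def merge_mimic(metadata_rows, split_rows, label_rows):
    key = lambda r: (r["subject_id"], r["study_id"])
    splits = {key(r): r["split"] for r in split_rows}
    labels = {key(r): r for r in label_rows}
    merged = []
    for row in metadata_rows:
        k = key(row)
        if k in labels:  # inner join: drop studies without disease labels
            merged.append({**row, **labels[k], "split": splits.get(k)})
    return merged
```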

Features: Implement training data size selection option

What

Implement training data size selection option for our custom dataset class

Why

  • Experimentally, it helps to find how much training data is needed before the metrics saturate
  • It is also helpful for debugging because it reduces code running time

How

  • Given that the total training data size is not small, and given the difficulty of stratified sampling of multilabel targets,
    random sampling is the first implementation strategy
  • Expected input is an integer (a number of samples) or a float in the range 0~1 (a ratio of the total size)
  • Implementation
  • Code test
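The int-or-float option can be sketched as below (the function and parameter names are hypothetical, not from our dataset class):

```python
# Sketch of the training-size selection option: an int picks that many
# samples, a float in (0, 1] picks that fraction of the dataset.
import random

def subsample_indices(n_total, train_size, seed=12345):
    """Return sorted random indices; seeded for reproducibility."""
    if isinstance(train_size, float):
        if not 0.0 < train_size <= 1.0:
            raise ValueError("float train_size must be in (0, 1]")
        k = max(1, int(n_total * train_size))
    else:
        k = min(train_size, n_total)  # clip to dataset size
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_total), k))
```

The dataset class would slice its dataframe with these indices before building samples.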

Features: Support saliency method: Grad-CAM

What

  • Add a function which shows the saliency maps.
  • First, we support Grad-CAM because it's a representative method.
  • Save saliency maps with wandb option.

Why

  • From the perspective of aiding medical doctors, Grad-CAM is nowadays one of the most used methods.
  • Although saliency maps are unfortunately not a verified method for medical tasks from an XAI perspective, many companies still provide them to medical doctors, so I decided to provide saliency maps for users who are medical doctors and researchers.
  • Among the many saliency methods, I chose Grad-CAM first, referring to the paper "Benchmarking saliency methods for chest X-ray interpretation".

How

  • Implement to support Grad-CAM.
  • Save saliency maps in local.
  • Save saliency maps in wandb.
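A minimal Grad-CAM sketch using forward/backward hooks (the model and target layer here are placeholders; in our code the layer would likely come from the model config):

```python
# Grad-CAM sketch: global-average-pooled gradients weight the target
# layer's activations, then ReLU + per-image normalization to [0, 1].
import torch
import torch.nn.functional as F

def grad_cam(model, target_layer, x, class_idx):
    acts, grads = {}, {}
    h1 = target_layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))
    try:
        score = model(x)[:, class_idx].sum()  # logit of the target class
        model.zero_grad()
        score.backward()
    finally:
        h1.remove()
        h2.remove()
    weights = grads["g"].mean(dim=(2, 3), keepdim=True)  # GAP over gradients
    cam = F.relu((weights * acts["a"]).sum(dim=1))       # weighted activations
    cam = cam / (cam.amax(dim=(1, 2), keepdim=True) + 1e-8)
    return cam  # (N, H, W) in [0, 1], to be upsampled onto the input image
```

The returned map would then be resized to the input resolution and logged locally or to wandb as an overlay.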
