
research-contributions's People

Contributors

ahatamiz, ardapekis, dependabot[bot], dongyang0122, drbeh, finalelement, heyufan1995, jianningli, jtetrea, kumoliu, lweitkamp, mingxin-zheng, monai-bot, myron, nic-ma, numanai, nvahmadi, pfjaeger, pre-commit-ci[bot], tangy5, upupming, wyli, yaomingamd, yiheng-wang-nv, zi-hao-wei


research-contributions's Issues

Is it possible to obtain the pre-training data?

Hi, thank you for your interesting work.

I notice that in the paper "UNetFormer: A Unified Vision Transformer Model and Pre-Training Framework for 3D Medical Image Segmentation", the authors state that they use a CT dataset containing 5050 slices for pre-training. The dataset is gathered from public datasets.

I am wondering if it is possible to obtain the pre-training data, or whether there is a public URL where I can download it?

RuntimeError: CUDA out of memory.

I got a CUDA out-of-memory error during inference. How can I solve this problem? (I have 60 validation images in total, and the code stopped after processing the 52nd image.) Thanks.

sliding_window_inference
count_map = torch.zeros(output_shape, dtype=torch.float32, device=device)
RuntimeError: CUDA out of memory. Tried to allocate 1.57 GiB (GPU 0; 11.91 GiB total capacity; 9.01 GiB already allocated; 826.94 MiB free; 10.27 GiB reserved in total by PyTorch)

Val 99/5000 0/60 acc 0.69666123 time 71.59s
Val 99/5000 1/60 acc 0.7665866 time 19.51s
Val 99/5000 2/60 acc 0.74568444 time 19.14s
Val 99/5000 3/60 acc 0.7421014 time 41.73s
Val 99/5000 4/60 acc 0.74470216 time 14.27s
Val 99/5000 5/60 acc 0.6023531 time 48.11s
Val 99/5000 6/60 acc 0.63559777 time 47.31s
Val 99/5000 7/60 acc 0.7078452 time 28.05s
Val 99/5000 8/60 acc 0.7461889 time 28.38s
Val 99/5000 9/60 acc 0.72240114 time 25.68s
Val 99/5000 10/60 acc 0.7349031 time 19.32s
Val 99/5000 11/60 acc 0.5812509 time 51.07s
Val 99/5000 12/60 acc 0.56357753 time 51.10s
Val 99/5000 13/60 acc 0.7025258 time 39.52s
Val 99/5000 14/60 acc 0.6934635 time 34.07s
Val 99/5000 15/60 acc 0.71059304 time 43.36s
Val 99/5000 16/60 acc 0.6659618 time 47.43s
Val 99/5000 17/60 acc 0.6793316 time 61.28s
Val 99/5000 18/60 acc 0.51980585 time 21.26s
Val 99/5000 19/60 acc 0.7111399 time 47.38s
Val 99/5000 20/60 acc 0.68195075 time 43.40s
Val 99/5000 21/60 acc 0.7277525 time 61.09s
Val 99/5000 22/60 acc 0.5373005 time 18.68s
Val 99/5000 23/60 acc 0.63938075 time 23.21s
Val 99/5000 24/60 acc 0.5794002 time 55.23s
Val 99/5000 25/60 acc 0.59343034 time 20.44s
Val 99/5000 26/60 acc 0.5908409 time 27.87s
Val 99/5000 27/60 acc 0.6639712 time 23.38s
Val 99/5000 28/60 acc 0.6337707 time 10.55s
Val 99/5000 29/60 acc 0.636496 time 23.24s
Val 99/5000 30/60 acc 0.62790823 time 9.39s
Val 99/5000 31/60 acc 0.59033436 time 21.28s
Val 99/5000 32/60 acc 0.6143217 time 4.17s
Val 99/5000 33/60 acc 0.6142419 time 27.90s
Val 99/5000 34/60 acc 0.6083099 time 60.02s
Val 99/5000 35/60 acc 0.6023102 time 46.00s
Val 99/5000 36/60 acc 0.5558359 time 22.66s
Val 99/5000 37/60 acc 0.59874105 time 23.64s
Val 99/5000 38/60 acc 0.6033232 time 12.78s
Val 99/5000 39/60 acc 0.6222449 time 4.19s
Val 99/5000 40/60 acc 0.61932653 time 46.00s
Val 99/5000 41/60 acc 0.57913965 time 23.35s
Val 99/5000 42/60 acc 0.66750497 time 25.70s
Val 99/5000 43/60 acc 0.6848543 time 26.12s
Val 99/5000 44/60 acc 0.6810599 time 25.68s
Val 99/5000 45/60 acc 0.6120205 time 39.74s
Val 99/5000 46/60 acc 0.6246956 time 39.86s
Val 99/5000 47/60 acc 0.58194286 time 39.77s
Val 99/5000 48/60 acc 0.5662599 time 47.26s
Val 99/5000 49/60 acc 0.56202275 time 27.64s
Val 99/5000 50/60 acc 0.70149297 time 54.35s
Val 99/5000 51/60 acc 0.7281418 time 30.59s
Val 99/5000 52/60 acc 0.6984218 time 68.52s
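One workaround for this kind of validation-time OOM is to keep the stitched output and count_map on the CPU and only move each window to the GPU (recent MONAI versions expose this in sliding_window_inference via its device and sw_device arguments). Below is a minimal pure-PyTorch sketch of the idea, sliding over the last axis only and assuming the model preserves the window's shape; it is an illustration, not the repository's actual inference code:

```python
import torch

def sliding_window_cpu_stitch(model, volume, roi=96, overlap=0.5):
    """Run each window on the GPU (if available) but keep the stitched
    output and count_map on the CPU, so the large accumulation buffers
    never trigger a CUDA allocation."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    step = max(1, int(roi * (1 - overlap)))
    d = volume.shape[-1]
    out = torch.zeros_like(volume)    # CPU accumulator
    count = torch.zeros_like(volume)  # CPU count_map
    starts = list(range(0, max(d - roi, 0) + 1, step)) or [0]
    if starts[-1] + roi < d:          # make sure the tail is covered
        starts.append(d - roi)
    for s in starts:
        e = min(s + roi, d)
        win = volume[..., s:e].to(device)
        with torch.no_grad():
            pred = model(win).cpu()   # move the result off the GPU right away
        out[..., s:e] += pred
        count[..., s:e] += 1
    return out / count.clamp(min=1)
```

Reducing sw_batch_size or the ROI size are the other usual levers when the stitching buffers themselves are not the problem.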

Computation time

Hi all,

I'm training a COVID-19 lesion segmentation model using COPLE-NET.
The computation time to train the model is very high: 3 days for 11 epochs, with around 200 CTs in my training set.
This seems a bit long to me.
Is this normal training time for COPLE-NET? Or am I the only one facing these long training times?
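Without knowing the exact setup it is hard to say, but one common speed-up worth trying is mixed-precision training, which often cuts per-iteration time substantially on modern GPUs. A hedged sketch of one training step (not specific to COPLE-NET; on a CPU-only machine the autocast/scaler parts become no-ops):

```python
import torch

def train_step_amp(model, batch, target, loss_fn, optimizer, scaler):
    """One training step with automatic mixed precision.
    With enabled=False (e.g. no GPU) this degrades to a normal FP32 step."""
    use_cuda = torch.cuda.is_available()
    optimizer.zero_grad()
    with torch.cuda.amp.autocast(enabled=use_cuda):
        loss = loss_fn(model(batch), target)
    scaler.scale(loss).backward()  # scale the loss to avoid FP16 underflow
    scaler.step(optimizer)
    scaler.update()
    return loss.item()

# scaler = torch.cuda.amp.GradScaler(enabled=torch.cuda.is_available())
```

Slow data loading (e.g. reading full CTs from disk every iteration without caching) is the other usual suspect for multi-day epochs.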

unetr for classification

Hello

first thanks for sharing this nice work.

This may be a stupid question, but I'll give it a try: would it be possible to modify the model architecture to do a classification task (volume-wise classification instead of pixel-wise segmentation)?

Would it make sense?
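It would make sense in principle: keep the transformer encoder, drop the segmentation decoder, and pool the token embeddings into a single vector for a classification head (MONAI's ViT even has a classification=True option). A sketch under those assumptions, where `encoder` stands in for any module producing (batch, tokens, hidden) embeddings and is not part of the original code:

```python
import torch
import torch.nn as nn

class VolumeClassifier(nn.Module):
    """Sketch: reuse a 3D transformer encoder (e.g. UNETR's ViT backbone)
    for volume-wise classification by pooling the token embeddings and
    discarding the segmentation decoder. `encoder` is any callable
    mapping a volume to (batch, num_tokens, hidden_size)."""
    def __init__(self, encoder, hidden_size, num_classes):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        tokens = self.encoder(x)     # (B, N, H)
        pooled = tokens.mean(dim=1)  # global average over tokens
        return self.head(pooled)     # (B, num_classes)
```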

Add UNETR Repository

Is your feature request related to a problem? Please describe.
Add a UNETR repository for various tasks, such as 3D multi-organ segmentation using the BTCV dataset. The model is based on the UNETR architecture.

Describe the solution you'd like
A pull request can address this feature request.

Ask for the code for running Swin UNETR on BraTS dataset

Hi, thank you for your great work.

In the paper, you said that you train and evaluate the Swin UNETR model on BraTS 2021. I see the repo has two folders, BTCV and BRATS2021, but BRATS2021 is empty. Would you release the code for running Swin UNETR on this dataset?

While waiting for the release, I am trying to modify the BTCV code to run on BraTS2021. Could you give me some instructions on how to do this efficiently?

Thank you.
Tan Thin.

question about test.py

When I use UNETR_model_best_acc.pt I can get the result, but when I use my own model.pt I get: RuntimeError: PytorchStreamReader failed locating file constants.pkl: file not found. How can I solve it?
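That error usually means an eager state-dict checkpoint was handed to torch.jit.load, which only reads TorchScript archives; the released UNETR_model_best_acc.pt is presumably a TorchScript archive, while a model.pt saved via torch.save(model.state_dict(), ...) is not. A small sketch of the two formats (using a toy nn.Linear, not the actual UNETR model):

```python
import os
import tempfile

import torch
import torch.nn as nn

workdir = tempfile.mkdtemp()
model = nn.Linear(4, 2)

# (A) Eager checkpoint: written with torch.save, read with torch.load.
sd_path = os.path.join(workdir, "model_sd.pt")
torch.save(model.state_dict(), sd_path)
model.load_state_dict(torch.load(sd_path, map_location="cpu"))

# (B) TorchScript archive: written with torch.jit.save, read with torch.jit.load.
ts_path = os.path.join(workdir, "model_ts.pt")
torch.jit.save(torch.jit.script(model), ts_path)
restored = torch.jit.load(ts_path)

# Feeding file (A) to torch.jit.load is what produces
# "PytorchStreamReader failed locating file constants.pkl".
```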

Applying Swin UNETR to a new segmentation task

Hi,

Thank you for the great work and open source code!

I am trying to use the Swin UNETR for my own segmentation task consisting of T2 brain images for segmentation into 9 labels. I would like to train a Swin UNETR on my own data and then infer on my test data. I am working with the BTCV repository as it seemed to have the structure that is required for running such a training and inference.

I put my training/validation/test data in the same folder structure as is required by the decathlon dataset with a dataset.json file describing the data (imagesTr, imagesTs, labelsTr, labelsTs consisting of .nii.gz files). I then launch python main.py --data_dir DATA_DIR --json_list dataset.json --roi_x 32 --roi_y 32 --roi_z 32 --batch_size 2 and get the following error which I do not quite understand how to fix.


ValueError: Expected more than 1 spatial element when training, got input size torch.Size([8, 768, 1, 1, 1])

My data ran perfectly with nnUNet (https://github.com/MIC-DKFZ/nnUNet) for comparison, and it should be noted that some of my NIfTI images have different sizes.
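The size torch.Size([8, 768, 1, 1, 1]) hints at the cause: assuming the default Swin UNETR configuration, the encoder downsamples each spatial axis by an overall factor of 32, so a 32×32×32 ROI collapses to a 1×1×1 bottleneck, and the norm layer then has only one spatial element to work with. A quick arithmetic check (the factor of 32 is an assumption about the default patch/stage setup, not taken from the repo):

```python
def bottleneck_size(roi, total_downsample=32):
    """Spatial size of the deepest feature map, assuming the encoder
    reduces each axis by `total_downsample` overall (a guess at the
    default Swin UNETR configuration)."""
    return max(roi // total_downsample, 0)

# roi 32 -> bottleneck 1  (norm fails: "more than 1 spatial element")
# roi 96 -> bottleneck 3  (the ROI used in the BTCV examples)
```

If that is the issue, training with --roi_x 96 --roi_y 96 --roi_z 96 (or any ROI large enough to leave more than one bottleneck voxel) should avoid the error.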

The accuracy between UNETR and Pre-trained Swin-UNETR

Dear Authors,

Thanks so much for the great work! I found these two papers from your group are quite interesting:
[1]. UNETR: Transformers for 3D Medical Image Segmentation
[2]. Self-Supervised Pre-Training of Swin Transformers for 3D Medical Image Analysis

So, when I looked at the accuracy in Table I of [2] for both UNETR and Swin UNETR, it suggested that the accuracy increase is pretty minor, and some organs have exactly the same DSC. For example, Rkid: 0.942, Lkid: 0.954, Aor: 0.948. It also suggested that the accuracy gains come from Veins, Pan and AGs.

So, I downloaded the segmentations you have submitted to the BTCV website, named "UNETR_newVer1.zip" for UNETR and "3D_SSL_pretrain_swinTransformer_and_SwinUNETR_v2.zip" for pre-trained Swin-UNETR. Please correct me if I downloaded the wrong ones.

I computed the DSC between these two submissions for the 20 test sets. Interestingly, most of the DSC values between the two submissions are 1. A DSC of 1 means the two organ segmentations are identical, which requires a pixel-level match. So my question is: how would this be possible, given that the segmentations come from different models, even with a 10-model ensemble for each of the two models?
I am looking forward to your reply, and I apologize if I missed something.

Thanks !

Reproduce the Paper Results

Hi Authors,

I wonder if you can provide the detailed instructions (and pretrained models as well ideally) to reproduce the results reported in the paper (0.918 avg. dice score). Thanks


preprocessing

thank you for your work!!
I have a question about the results. Because of limited memory, I split the 3D data of size 4×160×160×160 into 4×32×160×160 patches, but when training, the validation Dice decreased. I would appreciate any advice you can give.

data split of UNETR

Thanks for your great work! Can you provide me with the data splits for Brain Tumour, Spleen and BTCV? Thank you a lot.

Inconsistency between the results in the MONAI tutorial and the UNETR paper

Dear authors!

Thanks for your great work; it is insightful for me!

However, I have found an inconsistency between the results in the MONAI tutorial and the UNETR paper.

In the tutorial, and with my own re-implementation, the result is about 0.79, while it is 0.89 in the paper.

What is going on here? I noticed the paper states that additional training cases were introduced to reach 80 volumes; is that the main reason?

TypeError: __init__() got an unexpected keyword argument 'persistent_workers'

Describe the bug
I use the UNETR/BTCV code for multi-organ segmentation, but the DataLoader raises this error:
TypeError: __init__() got an unexpected keyword argument 'persistent_workers'. It occurs in .../monai/data/dataloader.py, line 87, in __init__: **kwargs.

To Reproduce
Steps to reproduce the behavior:

  1. Go to 'UNETR/BTCV'
  2. Install 'monai==0.7.0 nibabel==3.1.1 tqdm==4.59.0 einops==0.3.2 tensorboardx==2.1'
  3. Run commands 'python main.py
    --batch_size=1
    --logdir=unetr_pretrained
    --optim_lr=1e-4
    --lrschedule=warmup_cosine
    --infer_overlap=0.5
    --save_checkpoint
    --data_dir=/dataset/dataset0/
    --pretrained_dir='./pretrained_models/'
    --pretrained_model_name='UNETR_model_best_acc.pth'
    --resume_ckpt'

Expected behavior
Start the train correctly.

Screenshots

Environment (please complete the following information):

  • OS ubuntu16.04
  • Python version Python 3.6
  • MONAI version [e.g. git commit hash] 0.7.0
  • CUDA/cuDNN version 10.2
  • GPU models and configuration Geforce Nvidia RTX 2080ti 11G

Additional context
Add any other context about the problem here.

The structure of dataset

Hello,

First of all, thank you so much for this brilliant model. I have some questions.

  1. I have downloaded the dataset from the link embedded in the notebook, but the dataset I downloaded has just one label. Is that right, or have I downloaded the wrong one?

  2. Another question is about the structure of the ground truth. My dataset has a binary map for each class, so must I merge these binary maps into a single-channel label map in order to use your model?

I appreciate any guidance.
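If the model expects a single-channel label map (as the BTCV code does, with integer class indices), the per-class binary maps would indeed need to be merged by assigning each mask a class index. A sketch, assuming the masks do not overlap (where they do, later classes win):

```python
import numpy as np

def merge_binary_maps(binary_maps):
    """Collapse per-class binary masks into one single-channel label map,
    where 0 is background and class i+1 marks the i-th mask."""
    label = np.zeros(binary_maps[0].shape, dtype=np.int64)
    for i, mask in enumerate(binary_maps, start=1):
        label[mask > 0] = i
    return label
```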

ablation study on curriculum learning

Hi. I read your paper on LAMP which is quite interesting and practical.
But I have one question. In the paper, you mention curriculum learning for training models with a large patch size, yet I did not see any ablation study on this.

In the 3rd paragraph of the Experiment section, different learning schedules were used for different patch sizes. My understanding is that you trained the model with the large patch size only once, without resorting to the strategy of gradually increasing the patch size. Yet I'm not so sure. Could you please clarify a bit? Thank you.

OOM error with unetr validation when spacing is set at 1.0

Describe the bug
OOM at validation for the UNETR repository when spacing is set to 1 mm.

To Reproduce
Steps to reproduce the behavior:
Run the following command for training
python main.py --data_dir=./research-contributions/UNETR/BTCV/dataset/btcv/ --distributed --out_channels=14 --feature_size=16 --batch_size=2 --logdir=benchmark --optim_lr=4e-4 --lrschedule=warmup_cosine --infer_overlap=0.5 --max_epochs=5000 --save_checkpoint --space_x=1.0 --space_y=1.0 --space_z=1.0

Expected behavior
If the model fits the vram initially during training, no OOM error during validation

Screenshots

Environment (please complete the following information):

  • OS Ubuntu 20.04
  • Python version 3.6.10
  • MONAI version 0.8.0
  • CUDA/cuDNN version 11.0/8.0.4
  • Also tested on python 3.8.5, MONAI 0.7.0, CUDA/CuDNN 11.2/8.1.0, same issue
  • GPU models and configuration Tesla V100 16GB DGX-1

Additional context
I am trying to replicate the experiment in the UNETR paper on the BTCV dataset, where the isotropic voxel spacing is 1.0 mm. However, when the spacing is set to 1 for x, y and z, there is always an OOM error at the first validation (end of epoch 99).

Best Accuracy: 0.0

Describe the bug
I updated trainer.py, but I still get "Training Finished !, Best Accuracy: 0.0".

To Reproduce
Steps to reproduce the behavior:

  1. Go to 'UNETR/BTCV'
  2. Install '....'
  3. Run commands 'main.py'

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.

Environment (please complete the following information):

  • OS
  • Python version
  • MONAI version [e.g. git commit hash]
  • CUDA/cuDNN version
  • GPU models and configuration

Additional context
Add any other context about the problem here.

Train coplenet

I would like to have training code for COPLE-Net. Help us use the MONAI framework with SOTA models. How can we process the NIfTI images for training? How do we do image augmentation on 2D slices for COPLE-Net?
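Since COPLE-Net is a 2D network, 3D NIfTI volumes are typically fed slice by slice, and 2D augmentations are then applied per slice. A sketch of extracting axial slices from a volume array (loading the array via nibabel, e.g. nib.load(path).get_fdata(), is an assumed step not shown here):

```python
import numpy as np

def axial_slices(volume, min_foreground=0):
    """Yield (index, 2D slice) pairs from a (H, W, D) volume, optionally
    skipping slices whose foreground area is below a threshold; 2D
    augmentations can then be applied to each yielded slice."""
    for k in range(volume.shape[-1]):
        sl = volume[..., k]
        if (sl > 0).sum() >= min_foreground:
            yield k, sl
```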

Validation Accuracy on UNETR and Swin UNETR

Hi,
Thank you for sharing your wonderful work. Could you please provide some clarity on the evaluation accuracy on the BTCV validation dataset?

  1. For UNETR, when the provided pretrained models are evaluated on the BTCV validation set, the accuracy is 77.64, while the testing accuracy reported in the paper is 85.3. Is this large gap expected, and is it because of the difference between the validation and test datasets?

  2. For Swin UNETR, when the pretrained models are evaluated on the BTCV validation set, the accuracy is 81.56. The table in the Swin UNETR README reports 81.86, compared to the 91.8 test accuracy reported in the paper. Are the numbers in the README table the validation accuracy?

Thank you.

SwinUNETR on BTCV with unused CLI arguments

Describe the bug
The training script for SwinUNETR on BTCV includes CLI arguments for smoothing coefficients in the Dice loss. However, those coefficients are never used. Were they used for training the original models reported in the publication?

parser.add_argument('--smooth_dr', default=1e-6, type=float, help='constant added to dice denominator to avoid nan')
parser.add_argument('--smooth_nr', default=0.0, type=float, help='constant added to dice numerator to avoid zero')

dice_loss = DiceCELoss(to_onehot_y=True, softmax=True)

@ahatamiz
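Presumably the intended fix is simply to forward the parsed values into the loss, e.g. DiceCELoss(to_onehot_y=True, softmax=True, smooth_nr=args.smooth_nr, smooth_dr=args.smooth_dr), which MONAI's DiceCELoss accepts. For reference, what the two constants do in the soft-Dice term, as a pure-PyTorch sketch (not MONAI's actual implementation):

```python
import torch

def soft_dice_loss(pred, target, smooth_nr=0.0, smooth_dr=1e-6):
    """Soft Dice with the two smoothing constants from the CLI:
    smooth_nr is added to the numerator (avoids an all-zero numerator),
    smooth_dr to the denominator (avoids division by zero)."""
    intersection = (pred * target).sum()
    denom = pred.sum() + target.sum()
    return 1.0 - (2.0 * intersection + smooth_nr) / (denom + smooth_dr)
```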

After training, the accuracy is 0 (“Training Finished !, Best Accuracy: 0.0”). What is going on?

Describe the bug
A clear and concise description of what the bug is.

To Reproduce
Steps to reproduce the behavior:

  1. Go to '...'
  2. Install '....'
  3. Run commands '....'

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.

Environment (please complete the following information):

  • OS
  • Python version
  • MONAI version [e.g. git commit hash]
  • CUDA/cuDNN version
  • GPU models and configuration

Additional context
Add any other context about the problem here.

Offline save predict mask after UNETR

How can I save the prediction results offline after running UNETR? Thanks!

Also, why do you calculate Dice using reshaped masks? I don't think this is a correct procedure; rather, it is a trick.

You should save the size-restored prediction results offline and calculate Dice against the original full-size masks.
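The size restoration the reporter asks for can be sketched as a nearest-neighbour resample of the predicted label map back to the original grid; a full pipeline would instead invert the exact preprocessing transforms (e.g. with MONAI's Invertd), so this is only an approximation:

```python
import torch
import torch.nn.functional as F

def restore_prediction(pred_labels, original_shape):
    """Resample a predicted label map back to the original image grid
    before computing Dice, using nearest-neighbour interpolation so
    label values stay integral. pred_labels: (D, H, W) integer tensor."""
    x = pred_labels[None, None].float()              # (1, 1, D, H, W)
    x = F.interpolate(x, size=original_shape, mode="nearest")
    return x[0, 0].long()
```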

About Pretraining Data Formats

I downloaded the dataset for pre-training from TCIA, but I found that the downloaded data format is .dcm, which is inconsistent with the .nii.gz format in the JSON file. I wonder if something needs to be done to convert the format?

test.py model_dict = torch.load(pretrained_pth)["state_dict"] gives error

Describe the bug
When running test.py I am getting the error:

Traceback (most recent call last):
  File "test.py", line 115, in <module>
    main()
  File "test.py", line 76, in main
    model_dict = torch.load(pretrained_pth)["state_dict"]
KeyError: 'state_dict'

To Reproduce
Steps to reproduce the behavior:

  1. Go to:
    .../swin-unetr/research-contributions/SwinUNETR/BTCV
  2. I installed monai from swin-unetr repo following the repo readme.
  3. Run commands
conda activate <my MONAI swin-unetr environment>
export CUDA_VISIBLE_DEVICES=<my_device_number>
python test.py

Expected behavior
Test script to run and predict validation cases in my JSON file without errors. My JSON file was set up according to your guidelines and works fine for training.

Environment (please complete the following information):

  • OS:

Operating System: Ubuntu 20.04.4 LTS
Kernel: Linux 5.13.0-51-generic
Architecture: x86-64

  • Python version

python 3.7.13

  • MONAI version [e.g. git commit hash]

monai 0.9.0rc1+31.g38b7943a pypi_0 pypi

  • CUDA/cuDNN version

pytorch 1.11.0 py3.7_cuda11.3_cudnn8.2.0_0 pytorch
cudatoolkit 11.3.1 h2bc3f7f_2

  • GPU models and configuration
    NVIDIA A40
    NVIDIA-SMI 510.73.05 Driver Version: 510.73.05 CUDA Version: 11.6
    or
    NVIDIA A6000
    NVIDIA-SMI 470.57.02 Driver Version: 470.57.02 CUDA Version: 11.4

I'm in doubt about exactly what you need for the GPU configuration (not an expert in this). Let me know if you need additional info.

Additional context
During debugging I found out that if I simply change line 76 from
model_dict = torch.load(pretrained_pth)["state_dict"]
to
model_dict = torch.load(pretrained_pth)

the code and prediction seem to run as expected.

Let me know if you need further information.

stride in decoders 5 & 4 of UNETR?

Describe the bug
A clear and concise description of what the bug is.

To Reproduce
Steps to reproduce the behavior:

  1. Go to '...'
  2. Install '....'
  3. Run commands '....'

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.

Environment (please complete the following information):

  • OS
  • Python version
  • MONAI version [e.g. git commit hash]
  • CUDA/cuDNN version
  • GPU models and configuration

Additional context
Add any other context about the problem here.

WARNING:root:NaN or Inf found in input tensor when training UNETR network.

Describe the bug
Nan popped up while I was training the UNETR network.
WARNING:root:NaN or Inf found in input tensor.

To Reproduce
Steps to reproduce the behavior:

  1. Go to 'UNETR/BTCV'
  2. Run commands:
    CUDA_VISIBLE_DEVICES=1,2,3 python main.py --distributed --feature_size=32 --batch_size=4 --logdir=unetr_test --optim_lr=1e-3 --lrschedule=warmup_cosine --infer_overlap=0.5 --save_checkpoint --data_dir=/data/BTCV/Abdomen/RawData/Training/ --workers=12

Screenshots

Environment (please complete the following information):

  • OS
  • Python version 3.8
  • MONAI version [e.g. git commit hash]
  • CUDA/cuDNN version CUDA11.1
  • GPU models and configuration. V100

DATASET Problem

Hello! Each CT in the BTCV dataset used in the code contains all 14 organs. I want to apply UNETR to the RibFrac dataset, where each CT contains at most 5 types of rib fracture (most CTs contain only 2-4 types). How can I change the code? Looking forward to your reply, thank you.

A weird CUDA illegal memory access error when running the UNETR BTCV tutorial

Describe the bug
Thanks for sharing the great work.

I tried to run the following UNETR tutorial with the BTCV dataset:
https://github.com/Project-MONAI/tutorials/blob/master/3d_segmentation/unetr_btcv_segmentation_3d.ipynb

I got a weird CUDA illegal memory access error during training.


I also tried adding the following setting to locate the error; it occurred at loss.backward().

os.environ['CUDA_LAUNCH_BLOCKING'] = '1'


To Reproduce

import os
import shutil
import tempfile

import matplotlib.pyplot as plt
import numpy as np
from tqdm import tqdm

from monai.losses import DiceCELoss
from monai.inferers import sliding_window_inference
from monai.transforms import (
    AsDiscrete,
    AddChanneld,
    Compose,
    CropForegroundd,
    LoadImaged,
    Orientationd,
    RandFlipd,
    RandCropByPosNegLabeld,
    RandShiftIntensityd,
    ScaleIntensityRanged,
    Spacingd,
    RandRotate90d,
    ToTensord,
)

from monai.config import print_config
from monai.metrics import DiceMetric
from monai.networks.nets import UNETR

from monai.data import (
    DataLoader,
    CacheDataset,
    load_decathlon_datalist,
    decollate_batch,
)


import torch

print_config()

root_dir = '/home/jma/Documents/monai/UNETR'

train_transforms = Compose(
    [
        LoadImaged(keys=["image", "label"]),
        AddChanneld(keys=["image", "label"]),
        Spacingd(
            keys=["image", "label"],
            pixdim=(1.5, 1.5, 2.0),
            mode=("bilinear", "nearest"),
        ),
        Orientationd(keys=["image", "label"], axcodes="RAS"),
        ScaleIntensityRanged(
            keys=["image"],
            a_min=-175,
            a_max=250,
            b_min=0.0,
            b_max=1.0,
            clip=True,
        ),
        CropForegroundd(keys=["image", "label"], source_key="image"),
        RandCropByPosNegLabeld(
            keys=["image", "label"],
            label_key="label",
            spatial_size=(96, 96, 96),
            pos=1,
            neg=1,
            num_samples=4,
            image_key="image",
            image_threshold=0,
        ),
        RandFlipd(
            keys=["image", "label"],
            spatial_axis=[0],
            prob=0.10,
        ),
        RandFlipd(
            keys=["image", "label"],
            spatial_axis=[1],
            prob=0.10,
        ),
        RandFlipd(
            keys=["image", "label"],
            spatial_axis=[2],
            prob=0.10,
        ),
        RandRotate90d(
            keys=["image", "label"],
            prob=0.10,
            max_k=3,
        ),
        RandShiftIntensityd(
            keys=["image"],
            offsets=0.10,
            prob=0.50,
        ),
        ToTensord(keys=["image", "label"]),
    ]
)
# val_transforms = Compose(
#     [
#         LoadImaged(keys=["image", "label"]),
#         AddChanneld(keys=["image", "label"]),
#         Spacingd(
#             keys=["image", "label"],
#             pixdim=(1.5, 1.5, 2.0),
#             mode=("bilinear", "nearest"),
#         ),
#         Orientationd(keys=["image", "label"], axcodes="RAS"),
#         ScaleIntensityRanged(
#             keys=["image"], a_min=-175, a_max=250, b_min=0.0, b_max=1.0, clip=True
#         ),
#         CropForegroundd(keys=["image", "label"], source_key="image"),
#         ToTensord(keys=["image", "label"]),
#     ]
# )

#%% load data
data_dir = "./dataset/"
split_JSON = "BTCV.json"
datasets = data_dir + split_JSON
datalist = load_decathlon_datalist(datasets, True, "training")
val_files = load_decathlon_datalist(datasets, True, "validation")
train_ds = CacheDataset(
    data=datalist,
    transform=train_transforms,
    cache_num=24,
    cache_rate=1.0,
    num_workers=1,
)
train_loader = DataLoader(
    train_ds, batch_size=1, shuffle=True, num_workers=6, pin_memory=True
)
# val_ds = CacheDataset(
#     data=val_files, transform=val_transforms, cache_num=6, cache_rate=1.0, num_workers=4
# )
# val_loader = DataLoader(
#     val_ds, batch_size=1, shuffle=False, num_workers=1, pin_memory=True
# )


#%% Check dataset

# slice_map = {
#     "img0035.nii.gz": 170,
#     "img0036.nii.gz": 230,
#     "img0037.nii.gz": 204,
#     "img0038.nii.gz": 204,
#     "img0039.nii.gz": 204,
#     "img0040.nii.gz": 180,
# }
# case_num = 0
# img_name = os.path.split(val_ds[case_num]["image_meta_dict"]["filename_or_obj"])[1]
# img = val_ds[case_num]["image"]
# label = val_ds[case_num]["label"]
# img_shape = img.shape
# label_shape = label.shape
# print(f"image shape: {img_shape}, label shape: {label_shape}")
# plt.figure("image", (18, 6))
# plt.subplot(1, 2, 1)
# plt.title("image")
# plt.imshow(img[0, :, :, slice_map[img_name]].detach().cpu(), cmap="gray")
# plt.subplot(1, 2, 2)
# plt.title("label")
# plt.imshow(label[0, :, :, slice_map[img_name]].detach().cpu())
# plt.show()


#%% Create model, loss, optimizer
# os.environ['CUDA_LAUNCH_BLOCKING'] = '1'
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = UNETR(
    in_channels=1,
    out_channels=14,
    img_size=(96, 96, 96),
    feature_size=16,
    hidden_size=768,
    mlp_dim=3072,
    num_heads=12,
    pos_embed="perceptron",
    norm_name="instance",
    res_block=True,
    dropout_rate=0.0,
).to(device)

loss_function = DiceCELoss(to_onehot_y=True, softmax=True)
torch.backends.cudnn.benchmark = True
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-5)

def train(global_step, train_loader, dice_val_best, global_step_best):
    model.train()
    epoch_loss = 0
    step = 0
    epoch_iterator = tqdm(
        train_loader, desc="Training (X / X Steps) (loss=X.X)", dynamic_ncols=True
    )
    for step, batch in enumerate(epoch_iterator):
        step += 1
        x, y = (batch["image"].cuda(), batch["label"].cuda())
        logit_map = model(x)
        loss = loss_function(logit_map, y)
        loss.backward()
        epoch_loss += loss.item()
        optimizer.step()
        optimizer.zero_grad()
        epoch_iterator.set_description(
            "Training (%d / %d Steps) (loss=%2.5f)" % (global_step, max_iterations, loss)
        )
        if (
            global_step % eval_num == 0 and global_step != 0
        ) or global_step == max_iterations:
            epoch_iterator_val = tqdm(
                val_loader, desc="Validate (X / X Steps) (dice=X.X)", dynamic_ncols=True
            )
            dice_val = validation(epoch_iterator_val)
            epoch_loss /= step
            epoch_loss_values.append(epoch_loss)
            metric_values.append(dice_val)
            if dice_val > dice_val_best:
                dice_val_best = dice_val
                global_step_best = global_step
                torch.save(
                    model.state_dict(), os.path.join(root_dir, "best_metric_model.pth")
                )
                print(
                    "Model Was Saved ! Current Best Avg. Dice: {} Current Avg. Dice: {}".format(
                        dice_val_best, dice_val
                    )
                )
            else:
                print(
                    "Model Was Not Saved ! Current Best Avg. Dice: {} Current Avg. Dice: {}".format(
                        dice_val_best, dice_val
                    )
                )
        global_step += 1
    return global_step, dice_val_best, global_step_best


max_iterations = 25000
eval_num = 500
post_label = AsDiscrete(to_onehot=14)
post_pred = AsDiscrete(argmax=True, to_onehot=14)
dice_metric = DiceMetric(include_background=True, reduction="mean", get_not_nans=False)
global_step = 0
dice_val_best = 0.0
global_step_best = 0
epoch_loss_values = []
metric_values = []
while global_step < max_iterations:
    global_step, dice_val_best, global_step_best = train(
        global_step, train_loader, dice_val_best, global_step_best
    )
model.load_state_dict(torch.load(os.path.join(root_dir, "best_metric_model.pth")))

Any comments would be highly appreciated.

Environments

Ubuntu 20.04
NVIDIA 2080 Ti
CUDA 11.4
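One frequent cause of an illegal memory access inside loss.backward() in this tutorial is label values outside [0, out_channels): the one-hot/cross-entropy kernels then index out of bounds on the GPU. A cheap CPU-side check worth running over the dataset first (out_channels is 14 here; this is a guess at the cause, not a confirmed diagnosis):

```python
import torch

def check_label_range(label, num_classes=14):
    """Raise early on the CPU instead of crashing inside a CUDA kernel.
    `label` is an integer tensor of class indices."""
    lo, hi = int(label.min()), int(label.max())
    if lo < 0 or hi >= num_classes:
        raise ValueError(
            f"labels out of range [0, {num_classes}): min={lo}, max={hi}")
    return True
```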

Parameters of the problem

Hi, if I set pos_embed: str = "conv", do I need to change conv_block: bool = False to conv_block: bool = True?

The Implementation of the Residual Block in Swin UNETR

Hi! Thank you very much for your great work!
The residual block described in the Swin UNETR paper seems to reduce the channel dimension, but the residual block in the vanilla ResNet generally does not change the channel dimension. So I have a question about how you implemented the residual block in Swin UNETR: is there a 1×1×1 conv layer added after the usual residual block? I am a beginner in deep learning and would appreciate it if you could answer my question.

About Pretrained Swin Encoder Model

Hi, thank you for your great work.
I noticed the paper "Self-Supervised Pre-Training of Swin Transformers for 3D Medical Image Analysis". The paper says that you trained the Swin Transformer encoder on several medical datasets. I wonder if you could release the pretrained encoder model. Thank you; I'm waiting for your reply.

UNETR: Coping with limited data

Hello,
First of all, thank you very much for your great work!

I have a question regarding how you coped with the issue of having only very limited training data available (spleen segmentation: 41 CT scans). Transformer-based architectures like ViT, and also detectors like DETR, have been shown to perform well only when a huge amount of labeled data is available (DETR needs roughly 15k 2D images as a lower bound to train from scratch) and are known to converge very slowly. So I would think that training a 3D transformer-based architecture like UNETR would be even more data-hungry and would overfit, since such models converge slowly and only limited data is available.

So my question is basically: in your opinion, what are the key factors in the success of your approach when it comes to limited data? Is the random sampling of 96×96×96 patches the main factor? Wouldn't the performance increase if you skipped random sampling and instead used the whole CT scan as input, to have complete global information for attention?

Furthermore, I would be interested in why your transformer encoder converges so quickly (10 h) in comparison to the original ViT.

I would be very happy if you could answer my questions.
BR Bastian

Can you share the loss curve of the Swin-UNETR pre-training process

Hi, thanks for your great work on Swin-UNETR. I am trying to run pre-training on another dataset (~2000 CTs), but the loss curve does not seem to decrease.


Could you share your loss curve on the 5050 CTs dataset? Thank you very much!

I am pre-training the model on a single GPU with batch size 2.

main.py: error: unrecognized arguments: --fold=0

Hello, I ran main.py with this command:
python main.py
--feature_size=32
--batch_size=1
--logdir=unetr_test
--fold=0
--optim_lr=1e-4
--lrschedule=warmup_cosine
--infer_overlap=0.5
--save_checkpoint
--data_dir=/dataset/dataset0/

but there is a bug: "main.py: error: unrecognized arguments: --fold=0". I want to know what the "fold" parameter means.
Thank you very much.

RuntimeError when using the pretrained TorchScript model

Hi,
I have tried to test the pretrained TorchScript model on my dataset. I followed the instructions on where to save the pretrained model and how to create the dataset. However, when I try to run 'test.py' with the following command:

python test.py \
    --infer_overlap=0.5 \
    --data_dir=/dataset/dataset0/ \
    --pretrained_dir='./pretrained_models/' \
    --saved_checkpoint=torchscript

I get the following error:

RuntimeError: PytorchStreamReader failed locating file constants.pkl: file not found

Searching for this error on Stack Overflow suggests that the .pt file you uploaded might be corrupted. Could you please help me resolve this issue?

Thanks in advance!
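One way to narrow this down without PyTorch: a TorchScript archive is a zip file containing a constants.pkl entry, so a quick stdlib check can distinguish a truncated download from an eager-mode checkpoint that should be loaded with torch.load rather than torch.jit.load. A rough diagnostic sketch (the messages are mine, not PyTorch's):

```python
import zipfile

def diagnose_checkpoint(path):
    """Rough check of what kind of .pt file was downloaded (stdlib only)."""
    if not zipfile.is_zipfile(path):
        return "not a zip: file is corrupted/truncated or a legacy pickle checkpoint"
    names = zipfile.ZipFile(path).namelist()
    if any(n.endswith("constants.pkl") for n in names):
        return "TorchScript archive (use torch.jit.load)"
    return "zip without constants.pkl: likely an eager checkpoint (use torch.load)"
```

If the file turns out not to be a valid zip at all, re-downloading it (or checking its size against the release page) is probably the first thing to try.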

Slow model convergence

Hello, thanks for your nice work.
I followed your training strategy from UNETR and trained the model on another dataset with 13 organs to segment, but the model converges very slowly. I use CE + Dice loss with lr=0.0001. After 200 epochs of training, the CE loss is low (0.1) but the Dice loss is still high (0.9), and the validation Dice is only 0.25. Is this normal? Could there be a problem with the network architecture or the training strategy? (45 train / 5 val)
Thanks for your reply.

Parameter settings for MSD pancreas and pancreatic cancer detection

Hi Ali Hatamizadeh,
This is an interesting repository for 3D medical image segmentation. I trained the UNETR model for MSD pancreas and pancreatic cancer segmentation and got an average Dice coefficient of 0.62, using spacing = (1.5, 1.5, 0.8) and roi = (96, 96, 96).
Could you provide the parameter settings you used, such as spacing, ROI, intensity scaling, hidden layer size, feature size, etc.?
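For comparison, these are the settings as I read them from the MONAI BTCV tutorial and the UNETR paper; I assume the pancreas task needs its own spacing and intensity window, so treat this as a starting point, not the authors' pancreas configuration:

```python
# BTCV defaults as I understand them (tutorial + paper), not pancreas-specific.
unetr_btcv_settings = {
    "spacing": (1.5, 1.5, 2.0),        # Spacingd pixdim (mm)
    "roi": (96, 96, 96),               # random-crop patch size
    "intensity_window": (-175, 250),   # ScaleIntensityRanged a_min / a_max (HU)
    "hidden_size": 768,                # ViT embedding dimension
    "mlp_dim": 3072,
    "num_heads": 12,
    "feature_size": 16,
}
```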
I would be very happy if you could answer my questions.
Shyama

Typo

Describe the bug

Hello,

It seems the last item of this line and this line should be args.roi_z instead of args.roi_x. Similar issues are also present in other SwinUNETR folders.
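A minimal illustration of the fix (the args values are hypothetical):

```python
from types import SimpleNamespace

args = SimpleNamespace(roi_x=96, roi_y=96, roi_z=64)  # hypothetical values

# The buggy line builds the inference ROI as (roi_x, roi_y, roi_x); with an
# anisotropic ROI the depth is silently wrong. The last item must be roi_z.
inf_size_buggy = [args.roi_x, args.roi_y, args.roi_x]
inf_size_fixed = [args.roi_x, args.roi_y, args.roi_z]
```

For the common isotropic case (96, 96, 96) the two expressions coincide, which is probably why the typo went unnoticed.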

Labels do not match images in the cropping results (UNETR BTCV tutorial)

Describe the bug

I visualized the image-label patch pairs in the following tutorial:
https://github.com/Project-MONAI/tutorials/blob/master/3d_segmentation/unetr_btcv_segmentation_3d.ipynb
It seems that the labels do not match the images. Some examples are provided below.

[screenshots of mismatched image/label patches]

To Reproduce
Steps to reproduce the behavior:

  1. Go to https://github.com/Project-MONAI/tutorials/blob/master/3d_segmentation/unetr_btcv_segmentation_3d.ipynb
  2. After finishing the data loading, run commands
case_num = 0
crop_id = 1
img_name = os.path.split(train_ds[case_num][crop_id]["image_meta_dict"]["filename_or_obj"])[1]
label_name = os.path.split(train_ds[case_num][crop_id]["label_meta_dict"]["filename_or_obj"])[1]
img = train_ds[case_num][crop_id]["image"]
label = train_ds[case_num][crop_id]["label"]

img_shape = img.shape
label_shape = label.shape
print(f"image shape: {img_shape}, label shape: {label_shape}")
plt.figure("image", (18, 6))
plt.subplot(1, 2, 1)
plt.title(img_name)
plt.imshow(img[0, :, :, 36].detach().cpu(), cmap="gray")
plt.subplot(1, 2, 2)
plt.title(label_name)
plt.imshow(label[0, :, :, 36].detach().cpu())
plt.show()

I have a question about coplenet implementation.

Thank you for sharing your code!
I have a question about the CopleNet implementation.

CopleNet is clearly a 2D U-Net model that uses Conv2d. However, the code takes 3D input (NIfTI data) and uses the sliding-window function. Does the sliding-window function automatically handle 3D input even though the model is 2D?
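My current understanding, sketched in plain Python (a toy stand-in, not the actual CopleNet/MONAI code): the inferer crops windows matching the 2D roi_size, which effectively runs the 2D model slice by slice over the 3D volume.

```python
def slicewise_inference(volume, model_2d):
    """Run a 2D model over each axial slice of a 3D volume.

    Here the volume is a plain list of 2D slices (lists of rows); the
    model never sees the depth dimension directly.
    """
    return [model_2d(slc) for slc in volume]

# Toy "model": threshold each pixel of a slice at zero.
model_2d = lambda slc: [[1 if v > 0 else 0 for v in row] for row in slc]
volume = [[[-1, 2], [3, -4]], [[5, -6], [-7, 8]]]  # 2 slices of 2x2
pred = slicewise_inference(volume, model_2d)
```

Is that the behavior the sliding-window function implements here, or does it do something more involved across slices?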

P.S. There is no training code; could you share it?
