
nnunet's Introduction

Welcome to the new nnU-Net!

Click here if you were looking for the old one instead.

Coming from V1? Check out the TLDR Migration Guide. Reading the rest of the documentation is still strongly recommended ;-)

2024-04-18 UPDATE: New residual encoder UNet presets available!

Residual encoder UNet presets substantially improve segmentation performance. They ship for a variety of GPU memory targets. It's all awesome stuff, promised! Read more 👉 here 👈

Also check out our new paper on systematically benchmarking recent developments in medical image segmentation. You might be surprised!

What is nnU-Net?

Image datasets are enormously diverse: image dimensionality (2D, 3D), modalities/input channels (RGB image, CT, MRI, microscopy, ...), image sizes, voxel sizes, class ratio, target structure properties and more change substantially between datasets. Traditionally, given a new problem, a tailored solution needs to be manually designed and optimized - a process that is prone to errors, not scalable and where success is overwhelmingly determined by the skill of the experimenter. Even for experts, this process is anything but simple: there are not only many design choices and data properties that need to be considered, but they are also tightly interconnected, rendering reliable manual pipeline optimization all but impossible!

nnU-Net overview

nnU-Net is a semantic segmentation method that automatically adapts to a given dataset. It will analyze the provided training cases and automatically configure a matching U-Net-based segmentation pipeline. No expertise required on your end! You can simply train the models and use them for your application.

Upon release, nnU-Net was evaluated on 23 datasets belonging to competitions from the biomedical domain. Despite competing with handcrafted solutions for each respective dataset, nnU-Net's fully automated pipeline scored several first places on open leaderboards! Since then nnU-Net has stood the test of time: it continues to be used as a baseline and method development framework (9 out of 10 challenge winners at MICCAI 2020 and 5 out of 7 in MICCAI 2021 built their methods on top of nnU-Net, we won AMOS2022 with nnU-Net)!

Please cite the following paper when using nnU-Net:

Isensee, F., Jaeger, P. F., Kohl, S. A., Petersen, J., & Maier-Hein, K. H. (2021). nnU-Net: a self-configuring 
method for deep learning-based biomedical image segmentation. Nature methods, 18(2), 203-211.

What can nnU-Net do for you?

If you are a domain scientist (biologist, radiologist, ...) looking to analyze your own images, nnU-Net provides an out-of-the-box solution that is all but guaranteed to provide excellent results on your individual dataset. Simply convert your dataset into the nnU-Net format and enjoy the power of AI - no expertise required!

If you are an AI researcher developing segmentation methods, nnU-Net:

  • offers a fantastic out-of-the-box applicable baseline algorithm to compete against
  • can act as a method development framework to test your contribution on a large number of datasets without having to tune individual pipelines (for example evaluating a new loss function)
  • provides a strong starting point for further dataset-specific optimizations. This is particularly used when competing in segmentation challenges
  • provides a new perspective on the design of segmentation methods: maybe you can find better connections between dataset properties and best-fitting segmentation pipelines?

What is the scope of nnU-Net?

nnU-Net is built for semantic segmentation. It can handle 2D and 3D images with arbitrary input modalities/channels. It can understand voxel spacings, anisotropies and is robust even when classes are highly imbalanced.

nnU-Net relies on supervised learning, which means that you need to provide training cases for your application. The number of required training cases varies heavily depending on the complexity of the segmentation problem. No one-fits-all number can be provided here! nnU-Net does not require more training cases than other solutions - maybe even less due to our extensive use of data augmentation.

nnU-Net expects to be able to process entire images at once during preprocessing and postprocessing, so it cannot handle enormous images. As a reference: we tested images from 40x40x40 pixels all the way up to 1500x1500x1500 in 3D and 40x40 up to ~30000x30000 in 2D! If your RAM allows it, larger is always possible.
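As a rough back-of-the-envelope check of what "if your RAM allows it" means, the sketch below estimates the size of one dense image array. It assumes a single channel stored as 32-bit floats (4 bytes per voxel). The function name array_gib is ours, and actual peak usage during preprocessing may well be higher because of intermediate copies.

```python
def array_gib(shape, channels=1, bytes_per_voxel=4):
    """Estimate the size of a dense image array in GiB.

    Assumption: float32 storage (4 bytes per voxel) unless told otherwise.
    """
    voxels = channels
    for extent in shape:
        voxels *= extent
    return voxels * bytes_per_voxel / 2**30


if __name__ == "__main__":
    # The smallest and largest image sizes quoted in the text above.
    for shape in [(40, 40, 40), (1500, 1500, 1500), (30000, 30000)]:
        print(shape, f"{array_gib(shape):.2f} GiB")
```

Under these assumptions a single 1500x1500x1500 volume already occupies roughly 12.6 GiB, which is why such images sit near the practical limit.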

How does nnU-Net work?

Given a new dataset, nnU-Net will systematically analyze the provided training cases and create a 'dataset fingerprint'. nnU-Net then creates several U-Net configurations for each dataset:

  • 2d: a 2D U-Net (for 2D and 3D datasets)
  • 3d_fullres: a 3D U-Net that operates on a high image resolution (for 3D datasets only)
  • 3d_lowres → 3d_cascade_fullres: a 3D U-Net cascade where first a 3D U-Net operates on low resolution images and then a second high-resolution 3D U-Net refines the predictions of the former (for 3D datasets with large image sizes only)

Note that not all U-Net configurations are created for all datasets. In datasets with small image sizes, the U-Net cascade (and with it the 3d_lowres configuration) is omitted because the patch size of the full resolution U-Net already covers a large part of the input images.
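The selection logic described above can be pictured with a small toy model. This is only a sketch, not nnU-Net's actual rule: the function name configurations_for and the min_coverage threshold are hypothetical, chosen to illustrate "the cascade is omitted when the full-resolution patch already covers a large part of the image".

```python
def configurations_for(image_dims, median_shape=None, patch_shape=None,
                       min_coverage=0.25):
    """Return the configuration names a dataset would get in this toy model.

    min_coverage is a made-up threshold: if a full-resolution patch covers
    less than this fraction of the median image, the cascade is added.
    """
    configs = ["2d"]  # the 2D U-Net is created for 2D and 3D datasets
    if image_dims == 3:
        configs.append("3d_fullres")
        if median_shape and patch_shape:
            coverage = 1.0
            for patch, image in zip(patch_shape, median_shape):
                coverage *= min(patch, image) / image
            if coverage < min_coverage:
                configs += ["3d_lowres", "3d_cascade_fullres"]
    return configs


if __name__ == "__main__":
    print(configurations_for(2))
    print(configurations_for(3, (40, 40, 40), (40, 40, 40)))
    print(configurations_for(3, (137, 512, 512), (48, 192, 192)))
```

In this toy model a small 3D dataset gets only 2d and 3d_fullres, while a dataset whose patch covers a small fraction of the median image also gets the two cascade stages.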

nnU-Net configures its segmentation pipelines based on a three-step recipe:

  • Fixed parameters are not adapted. During development of nnU-Net we identified a robust configuration (that is, certain architecture and training properties) that can simply be used all the time. This includes, for example, nnU-Net's loss function, (most of the) data augmentation strategy and learning rate.
  • Rule-based parameters use the dataset fingerprint to adapt certain segmentation pipeline properties by following hard-coded heuristic rules. For example, the network topology (pooling behavior and depth of the network architecture) is adapted to the patch size; the patch size, network topology and batch size are optimized jointly given some GPU memory constraint.
  • Empirical parameters are essentially trial-and-error. Examples are the selection of the best U-Net configuration for the given dataset (2D, 3D full resolution, 3D low resolution, 3D cascade) and the optimization of the postprocessing strategy.

How to get started?

Read these:

Additional information:

Competitions:

Where does nnU-Net perform well and where does it not perform?

nnU-Net excels in segmentation problems that need to be solved by training from scratch, for example: research applications that feature non-standard image modalities and input channels, challenge datasets from the biomedical domain, the majority of 3D segmentation problems, etc. We have yet to find a dataset for which nnU-Net's working principle fails!

Note: On standard segmentation problems, such as 2D RGB images in ADE20k and Cityscapes, fine-tuning a foundation model (that was pretrained on a large corpus of similar images, e.g. ImageNet 22k, JFT-300M) will provide better performance than nnU-Net! That is simply because these models allow much better initialization. Foundation models are not supported by nnU-Net as they 1) are not useful for segmentation problems that deviate from the standard setting (see above mentioned datasets), 2) would typically only support 2D architectures and 3) conflict with our core design principle of carefully adapting the network topology for each dataset (if the topology is changed one can no longer transfer pretrained weights!).

What happened to the old nnU-Net?

The core of the old nnU-Net was hacked together in a short time period while participating in the Medical Segmentation Decathlon challenge in 2018. Consequently, code structure and quality were not the best. Many features were added later on and didn't quite fit into the nnU-Net design principles. Overall quite messy, really. And annoying to work with.

nnU-Net V2 is a complete overhaul. The "delete everything and start again" kind. So everything is better (in the author's opinion haha). While the segmentation performance remains the same, a lot of cool stuff has been added. It is now also much easier to use it as a development framework and to manually fine-tune its configuration to new datasets. A big driver for the reimplementation was also the emergence of Helmholtz Imaging, prompting us to extend nnU-Net to more image formats and domains. Take a look here for some highlights.

Acknowledgements

nnU-Net is developed and maintained by the Applied Computer Vision Lab (ACVL) of Helmholtz Imaging and the Division of Medical Image Computing at the German Cancer Research Center (DKFZ).

nnunet's People

Contributors

ancestor-mithril, andife, arthursw, fabianisensee, jagh, joeranbosma, joshestein, joshuacwnewton, karol-g, kck278, mrokuss, mzenk, nils-christianiseke, pdelgado248, petermfull, plasma-blue, schuhegger, silvandeleemput, sten2lu, strasserpatrick, tawald, vincentme, wasserth, whikwon, willgiang, yarikoptic, yinglilu, ykirchhoff, yulv-git, ziqiangxu

nnunet's Issues

How can we take full advantage of multiple GPUs to train several models at the same time?

Hi @FabianIsensee ,

Thanks for the great repository. I tested it on Linux and it works very well.

Q1. How can we take full advantage of multiple GPUs to train several models at the same time?

My GPU server has 8 Nvidia TitanX GPUs (12GB). I want to train multiple models at the same time on different GPUs, for example a 2D U-Net and a 3D U-Net (full resolution) with different folds.

CUDA_VISIBLE_DEVICES=1 python run/run_training.py 2d nnUNetTrainer TaskXX_MY_DATASET all --ndet
CUDA_VISIBLE_DEVICES=2 python run/run_training.py 3d_fullres nnUNetTrainer TaskXX_MY_DATASET 0 --ndet
CUDA_VISIBLE_DEVICES=3 python run/run_training.py 3d_fullres nnUNetTrainer TaskXX_MY_DATASET 1 --ndet

However, if I do this (e.g. training only 2 models), the Linux server tends to stall, and the time per epoch increases by about 60% for each training task. To be precise, when I run one run_training.py, each epoch takes ~480s. If I run two, each epoch takes ~800s per training task. So I can only run one training task at a time, and the remaining GPUs are not fully used. How can I efficiently train several models at the same time?

Q2. The following is a screenshot of htop while one training task is running. Why does one training task use 32 PIDs?

[Image: htop]

Q3.1 Does nnU-Net use the cascade for a large dataset (e.g. LiTS) even if it is run with 3d_fullres?

[Image]

Q3.2 When counting the number of classes (num_classes), nnU-Net does not count the background class, right?

A typo in readme->Inference

python inference/predict_simple.py -i INPUT_FOLDER -o OUTPUT_FOLDER_CASCADE -t TaskXX_MY_DATASET -tr nnUNetTrainerCascadeFullRes -m 3d_fullres_cascade -l OUTPUT_FOLDER_LOWRES
It should be 3d_cascade_fullres rather than 3d_fullres_cascade.

I'm looking forward to your reply.
Best regards,
Jun
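For Q1 in the post above, one way to script the three commands is a small launcher that gives every run its own GPU and a capped number of CPU threads. This is a sketch under assumptions: build_jobs and launch are hypothetical helper names, and whether limiting OMP_NUM_THREADS removes the slowdown depends on whether the CPU (data loading and augmentation) is the actual bottleneck, which the post does not confirm.

```python
import os
import subprocess


def build_jobs(runs, gpu_ids, threads_per_job=1):
    """Pair each (network, fold) run with one GPU and a private environment."""
    if len(runs) > len(gpu_ids):
        raise ValueError("more runs than GPUs")
    jobs = []
    for (network, fold), gpu in zip(runs, gpu_ids):
        env = dict(os.environ)
        env["CUDA_VISIBLE_DEVICES"] = str(gpu)         # the process sees only this GPU
        env["OMP_NUM_THREADS"] = str(threads_per_job)  # cap OpenMP threads per process
        cmd = ["python", "run/run_training.py", network, "nnUNetTrainer",
               "TaskXX_MY_DATASET", str(fold), "--ndet"]
        jobs.append((cmd, env))
    return jobs


def launch(jobs):
    """Start all jobs concurrently and return their exit codes."""
    procs = [subprocess.Popen(cmd, env=env) for cmd, env in jobs]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    runs = [("2d", "all"), ("3d_fullres", 0), ("3d_fullres", 1)]
    # Dry run: print what would be started instead of calling launch().
    for cmd, env in build_jobs(runs, gpu_ids=[1, 2, 3]):
        print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], " ".join(cmd))
```

Watching CPU load while two jobs run is a quick way to tell whether the GPUs are waiting for data.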

Cannot run final stage of cascade

Dear Fabian,

Thanks for the great repo.

I want to use the 3D U-Net cascade.
First, I run

for FOLD in 0 1 2 3 4; do
    python run/run_training.py 3d_lowres nnUNetTrainer TaskXX_MY_DATASET $FOLD --ndet
done

and nnUNet generates predictions of the validation dataset in each folder.

Then, I run

python run/run_training.py 3d_cascade_fullres nnUNetTrainerCascadeFullRes TaskXX_MY_DATASET 0 --ndet

but I get the following error:

###############################################
Traceback (most recent call last):
  File "run/run_training.py", line 90, in <module>
    batch_dice=batch_dice, stage=stage, unpack_data=unpack, deterministic=deterministic)
  File "/home/jma/Code/nnUNet/nnunet/training/network_training/nnUNetTrainerCascadeFullRes.py", line 31, in __init__
    "Cannot run final stage of cascade. Run corresponding 3d_lowres first and predict the "
RuntimeError: Cannot run final stage of cascade. Run corresponding 3d_lowres first and predict the segmentations for the next stage

I'm confused about the error, because **the segmentations have been generated automatically during the 3d_lowres step (in the validation folder).**

Could you give some insights on this error?

Looking forward to your reply.
Best,
Jun

self.dataset_properties['all_spacings'] is empty

Hi there, very nice work!
I'm trying to run your nnU-Net on the LiTS dataset, but it fails after cropping when running plan_and_preprocess, because the field "all_spacings" is empty in the dataset_properties loaded from the pickle file. Both 'all_spacings' and 'all_sizes' are empty, while the other fields are filled with information that makes sense (modality, all_classes, class_dct...), so I wonder why only these 2 fields are empty. Thanks!

This is the specific line where it fails in experiment_planner_baseline_3DUNet.py:

target = np.percentile(np.vstack(spacings), TARGET_SPACING_PERCENTILE, 0)
since the field is empty for some reason.

This is the error:

Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2017.3.3\helpers\pydev\pydevd.py", line 1668, in <module>
    main()
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2017.3.3\helpers\pydev\pydevd.py", line 1662, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2017.3.3\helpers\pydev\pydevd.py", line 1072, in run
    pydev_imports.execfile(file, globals, locals) # execute the script
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2017.3.3\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "D:/michal/liverLesions/nnUNet/nnunet/experiment_planning/plan_and_preprocess_task.py", line 238, in <module>
    plan_and_preprocess(task, processes, no_preprocessing)
  File "D:/michal/liverLesions/nnUNet/nnunet/experiment_planning/plan_and_preprocess_task.py", line 136, in plan_and_preprocess
    exp_planner.plan_experiment()
  File "D:\michal\liverLesions\nnUNet\nnunet\experiment_planning\experiment_planner_baseline_3DUNet.py", line 229, in plan_experiment
    target_spacing = self.get_target_spacing()
  File "D:\michal\liverLesions\nnUNet\nnunet\experiment_planning\experiment_planner_baseline_3DUNet.py", line 60, in get_target_spacing
    target = np.percentile(np.vstack(spacings), TARGET_SPACING_PERCENTILE, 0)
  File "C:\Users\admin\Anaconda3\envs\pytorch\lib\site-packages\numpy\core\shape_base.py", line 283, in vstack
    return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
ValueError: need at least one array to concatenate
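np.vstack raises exactly this ValueError when it is handed an empty sequence, so the traceback is a symptom of the empty 'all_spacings' field rather than of the percentile computation itself. A small guard run before planning can make that explicit. The sketch below is a hypothetical helper (check_dataset_properties is not part of nnU-Net) and it assumes the properties behave like a dict of sequences, as the post describes.

```python
def check_dataset_properties(props, required=("all_spacings", "all_sizes")):
    """Raise a descriptive error if any required per-case field is empty.

    Assumption: props is a dict-like object loaded from the pickle file.
    """
    empty = [key for key in required if len(props.get(key, ())) == 0]
    if empty:
        raise ValueError(
            "dataset properties have no entries for: " + ", ".join(empty)
            + " (check whether any training cases were found during cropping)"
        )
    return props


if __name__ == "__main__":
    good = {"all_spacings": [(3.0, 0.78, 0.78)], "all_sizes": [(137, 512, 512)]}
    print(check_dataset_properties(good) is good)
```

An empty per-case list usually means that no cases were collected at an earlier step, so the cropped output folder is a sensible first place to look.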

Parameters in 'resample_data_or_seg'

Hi Fabian,

I have a question about the function 'resample_data_or_seg', since it looks so important in the preprocessing.

def resample_data_or_seg(data, new_shape, is_seg, axis=None, order=3, do_separate_z=False, cval=0, order_z=0):
    """
    separate_z=True will resample with order 0 along z
    :param data:
    :param new_shape:
    :param is_seg:
    :param axis:
    :param order:
    :param do_separate_z:
    :param cval:
    :param order_z: only applies if do_separate_z is True
    :return:
    """

I am wondering about parameters like do_separate_z: how does it affect the resampling?
And what is the meaning of axis here?

Thank you for your help and reply.

Best Regards
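The docstring above says that separate_z=True resamples "with order 0 along z". Order 0 is nearest-neighbour interpolation: every output slice is a copy of the closest input slice, and no new intensity or label values are invented between slices. The sketch below illustrates only that idea on a plain list of slices. It is not the library's implementation: resample_z_nearest is a hypothetical name, the in-plane resampling with the higher `order` is omitted, and the coordinate convention (slice centres) is an assumption.

```python
def resample_z_nearest(slices, new_depth):
    """Resample a stack of 2D slices along z with nearest-neighbour lookup."""
    old_depth = len(slices)
    if old_depth == 0 or new_depth <= 0:
        raise ValueError("need at least one slice and a positive target depth")
    out = []
    for i in range(new_depth):
        # Map the centre of output slice i back into input coordinates.
        src = int((i + 0.5) * old_depth / new_depth)
        out.append(slices[min(src, old_depth - 1)])
    return out


if __name__ == "__main__":
    print(resample_z_nearest(["a", "b"], 4))
    print(resample_z_nearest(["a", "b", "c", "d"], 2))
```

Treating z separately is typically of interest for anisotropic data, where slices are much further apart than the in-plane pixels.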

Problem with Cascade Training

Hey Fabian,

Thanks for your fantastic and brilliant work!

I recently encountered a problem when trying to run cascade training.

###############################################
2019-07-17 15:23:20.822183: Using dummy2d data augmentation
2019-07-17 15:23:22.990462: Using dummy2d data augmentation
Traceback (most recent call last):
  File "run/run_training.py", line 96, in <module>
    trainer.initialize(not validation_only)
  File "nnUNet/nnunet/training/network_training/nnUNetTrainerCascadeFullRes.py", line 118, in initialize
    self.dl_tr, self.dl_val = self.get_basic_generators()
  File "nnUNet/nnunet/training/network_training/nnUNetTrainerCascadeFullRes.py", line 59, in get_basic_generators
    self.do_split()
  File "nnUNet/nnunet/training/network_training/nnUNetTrainerCascadeFullRes.py", line 49, in do_split
    "seg from prev stage missing: %s" % (self.dataset[k]['seg_from_prev_stage_file'])
AssertionError: seg from prev stage missing: /nnUNet/nnUNet_preprocessed/Task00_Test/segs_prev_stage/00000_segFromPrevStage.npz

By looking into the code:

if network == '3d_lowres':
    trainer.load_best_checkpoint(False)
    print("predicting segmentations for the next stage of the cascade")
    predict_next_stage(trainer, join(dataset_directory, trainer.plans['data_identifier'] + "_stage%d" % 1))

and jumping to nnunet/training/cascade_stuff/predict_next_stage.py
I found

def predict_next_stage(trainer, stage_to_be_predicted_folder):
    output_folder = join(pardir(trainer.output_folder), "pred_next_stage")
    maybe_mkdir_p(output_folder)

    process_manager = Pool(2)
    results = []

    for pat in trainer.dataset_val.keys():
        print(pat)
        data_file = trainer.dataset_val[pat]['data_file']
        ......

Here, it seems that the predict_next_stage only processes the validation data, not training data.

For the convenience of comparing the performance of different models, I modified your code to fix the training sample IDs and validation sample IDs when FOLD is all.

I guess this problem comes from using the all fold: there are no other cross-validation folds whose validation sets would together form a complete segFromPrevStage set for the next stage.

Is there any way to apply cascade training when previously using all for the lowRes model?

Best,
Jiawei
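The reasoning in the post above can be checked with a few lines: if each low-resolution model predicts only its own validation cases, the five folds together cover every case, whereas a single fixed validation subset leaves the training cases without a segmentation from the previous stage. The sketch uses hypothetical helper names and a deterministic round-robin split purely for illustration. The real split is not assumed to work this way.

```python
def kfold_validation_sets(case_ids, n_folds=5):
    """Split case IDs into n_folds disjoint validation sets (round-robin)."""
    ordered = sorted(case_ids)
    return [ordered[i::n_folds] for i in range(n_folds)]


def missing_prev_stage(case_ids, predicted_ids):
    """Return the cases that have no segmentation from the previous stage."""
    return sorted(set(case_ids) - set(predicted_ids))


if __name__ == "__main__":
    cases = [f"{i:05d}" for i in range(10)]
    folds = kfold_validation_sets(cases)
    all_predicted = [case for fold in folds for case in fold]
    print(missing_prev_stage(cases, all_predicted))  # nothing missing
    print(missing_prev_stage(cases, folds[0]))       # everything outside one fold
```

With only one validation subset predicted, the missing list contains every other case, which matches the assertion error about 00000_segFromPrevStage.npz.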

Inference Stuck

Hi Fabian,

During inference, all cases have been processed, but the program gets stuck here.
Even when I use only 3 cases for inference, the issue occurs again.

Have you ever had this problem with LiTS?
And what does "separate z: False lowres axis None" mean?

[Image: Capture - Copy]

Best,
Jun

How to modify or resize tr_gen,val_gen if we use deep_supervision

Hi,

I found that there is a parameter that controls whether deep supervision is used when initializing the Generic_UNet, but it is set to False by default. If I want to set 'self.do_ds' to True, should I also modify tr_gen and val_gen so that they provide labels for the seg_outputs from the different layers? And should I also modify the loss function to make use of the deep supervision?

Thank you!

Best Regards!

[Request] Configuration for KiTs19

Hi Fabian

Now, I am trying to reproduce your score in KiTs19 challenge.
I used your nnUnet architecture, augmentations, and other config in my own workflow. Unfortunately, I cannot reproduce your performance.
I would be so grateful if you can update a new version for KiTs challenge.

Thank you so much.

soft dice loss , square numerator and denominator

def soft_dice(net_output, gt, smooth=1., smooth_in_nom=1., square_nominator=False, square_denom=False):
    axes = tuple(range(2, len(net_output.size())))
    if square_nominator:
        intersect = sum_tensor(net_output * gt, axes, keepdim=False)
    else:
        intersect = sum_tensor((net_output * gt) ** 2, axes, keepdim=False)
    if square_denom:
        denom = sum_tensor(net_output ** 2 + gt ** 2, axes, keepdim=False)
    else:
        denom = sum_tensor(net_output + gt, axes, keepdim=False)
    result = (- ((2 * intersect + smooth_in_nom) / (denom + smooth))).mean()
    return result

When square_nominator is False, the nominator is squared, whereas the denominator is only squared when square_denom is True. Is that the intended usage? What is the advantage in this case?
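To make the question concrete, here is a plain-Python sketch of the same formula on flat lists, written under the assumption that each flag squares its own term when it is True (that is, with the two nominator branches the other way round than in the function quoted above). Whether that is the behaviour the author intended is not confirmed anywhere in this thread, and soft_dice_flat is a hypothetical name.

```python
def soft_dice_flat(pred, gt, smooth=1.0, smooth_in_nom=1.0,
                   square_nominator=False, square_denom=False):
    """Negative soft Dice for flat lists of probabilities and 0/1 labels.

    Assumption: each flag squares its own term when True.
    """
    products = [p * g for p, g in zip(pred, gt)]
    if square_nominator:
        intersect = sum(x ** 2 for x in products)
    else:
        intersect = sum(products)
    if square_denom:
        denom = sum(p ** 2 + g ** 2 for p, g in zip(pred, gt))
    else:
        denom = sum(p + g for p, g in zip(pred, gt))
    return -(2 * intersect + smooth_in_nom) / (denom + smooth)


if __name__ == "__main__":
    pred, gt = [0.5, 0.5], [1, 0]
    print(soft_dice_flat(pred, gt))
    print(soft_dice_flat(pred, gt, square_nominator=True))
```

For soft predictions between 0 and 1, squaring the product shrinks the intersection term, so the two settings give different loss values even for identical inputs.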

Is that a bug?

Hi Fabian,
I found the following code in nnUNetTrainer; it may have a bug.

nnUNet/nnunet/network_architecture/generic_UNet.py, line 230~233:
you set lambda x: x, InitWeights_He(1e-2) as final_nonlin

self.network = Generic_UNet(self.num_input_channels, self.base_num_features, self.num_classes, net_numpool,
                            2, 2, conv_op, norm_op, norm_op_kwargs, dropout_op, dropout_op_kwargs,
                            net_nonlin, net_nonlin_kwargs, False, False, lambda x: x, InitWeights_He(1e-2),
                            self.net_num_pool_op_kernel_sizes, self.net_conv_kernel_sizes, False, True, True)

nnUNet/nnunet/network_architecture/generic_UNet.py, line 169~177.

def __init__(self, input_channels, base_num_features, num_classes, num_pool, num_conv_per_stage=2,
                 feat_map_mul_on_downscale=2, conv_op=nn.Conv2d,
                 norm_op=nn.BatchNorm2d, norm_op_kwargs=None,
                 dropout_op=nn.Dropout2d, dropout_op_kwargs=None,
                 nonlin=nn.LeakyReLU, nonlin_kwargs=None, deep_supervision=True, dropout_in_localization=False,
                 final_nonlin=softmax_helper, weightInitializer=InitWeights_He(1e-2), pool_op_kernel_sizes=None,
                 conv_kernel_sizes=None,
                 upscale_logits=False, convolutional_pooling=False, convolutional_upsampling=False,
                 max_num_features=None):

Look forward to your reply
Thanks

how to use this code to run BraTs2018 data?

Hello, thank you for sharing your code. This project is very meaningful.
My system is Ubuntu 18.04 and my PyTorch version is 1.0.
I'm a beginner and want to use your code on the BraTS18 data, but I don't know how to set up the file directories. How do I set the first parameter when I execute python experiment_planning/plan_and_preprocess_task.py -t TaskXX_MY_DATASET -p Y ?

Interpretation of score

Hi

I get this score:

            2019-08-28 07:32:10.274085:
	epoch:  519
	2019-08-28 07:38:44.771640: train loss : -0.9634
	2019-08-28 07:39:09.508282: val loss (train=False): -0.0622
	2019-08-28 07:39:09.508541: Val glob dc per class: [0.07488450598148061]
	2019-08-28 07:39:10.148665: lr is now (scheduler) 1.2e-05
	2019-08-28 07:39:10.148762: current best_val_eval_criterion_MA is 0.08170
	2019-08-28 07:39:10.148827: current val_eval_criterion_MA is 0.0713
	2019-08-28 07:39:10.148878: No improvement: current train MA -0.9654, best: -0.9663, eps is 0.0005
	2019-08-28 07:39:10.148948: Patience: 19/50
	2019-08-28 07:39:10.149020: This epoch took 419.234268 s

	2019-08-28 07:39:10.149094:
	epoch:  520
	2019-08-28 07:45:44.826182: train loss : -0.9671
	2019-08-28 07:46:09.554083: val loss (train=False): -0.0630
	2019-08-28 07:46:09.554411: Val glob dc per class: [0.07556941617334902]
	2019-08-28 07:46:10.196557: lr is now (scheduler) 1.2e-05
	2019-08-28 07:46:10.196655: current best_val_eval_criterion_MA is 0.08170
	2019-08-28 07:46:10.196733: current val_eval_criterion_MA is 0.0717
	2019-08-28 07:46:10.196869: No improvement: current train MA -0.9655, best: -0.9663, eps is 0.0005
	2019-08-28 07:46:10.196955: Patience: 20/50
	2019-08-28 07:46:10.197019: This epoch took 419.405029 s

As you can see, the train loss is about -0.96, but the val loss is nowhere close. I haven't predicted with this model yet as it is still training, but does the score make sense to you?

Thanks.
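To follow the gap between training and validation loss over many epochs without reading the log by eye, the two values can be pulled out with a regular expression. The pattern below is written only against the log lines quoted in the post above; parse_losses is a hypothetical helper, and other versions of the log format may need a different pattern.

```python
import re

# Matches e.g. "train loss : -0.9634" and "val loss (train=False): -0.0622".
LOSS_RE = re.compile(r"(train loss|val loss \(train=False\))\s*:\s*(-?\d+\.\d+)")


def parse_losses(log_text):
    """Collect train and validation losses in the order they appear."""
    losses = {"train": [], "val": []}
    for name, value in LOSS_RE.findall(log_text):
        key = "train" if name.startswith("train") else "val"
        losses[key].append(float(value))
    return losses


if __name__ == "__main__":
    sample = """
    2019-08-28 07:38:44.771640: train loss : -0.9634
    2019-08-28 07:39:09.508282: val loss (train=False): -0.0622
    2019-08-28 07:45:44.826182: train loss : -0.9671
    2019-08-28 07:46:09.554083: val loss (train=False): -0.0630
    """
    parsed = parse_losses(sample)
    for train, val in zip(parsed["train"], parsed["val"]):
        print(f"train {train:.4f}  val {val:.4f}  gap {val - train:.4f}")
```

A gap of about 0.9 that stays constant, together with a validation Dice near 0.07, suggests the model fits the training cases but does not transfer to the validation cases. Overfitting and a mismatch between the two sets are both possible explanations.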

Your result on the LiTS challenge

Hi Fabian,
I would like to know how you got your result on the LiTS challenge leaderboard. Did you just use this framework with a simple U-Net architecture and some ensembling method?
Looking forward to your reply, thank you!

MultiThreadedAugmenter crashes

Hi, I am trying to run the following experiment, but it crashes:

python -u run/run_training.py 3d_lowres nnUNetTrainer Task17_challenge2019 0 --ndet

I added export OMP_NUM_THREADS=1 to my .bashrc and also changed the numpy version to 1.14.5, but the problem is still unresolved.

stage:  1
{'batch_size': 2, 'num_pool_per_axis': [3, 5, 5], 'patch_size': array([ 48, 192, 192]), 'median_patient_size_in_voxels': array([137, 512, 512]), 'current_spacing': array([3.     , 0.78125, 0.78125]), 'original_spacing': array([3.     , 0.78125, 0.78125]), 'do_dummy_2D_data_aug': True, 'pool_op_kernel_sizes': [[1, 2, 2], [1, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2]], 'conv_kernel_sizes': [[1, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3]]}

I am using stage 0 from these plans
I am using sample dice + CE loss

I am using data from this folder:  /home/mehrtash/nn_unet_data/preprocessing/nnUNet/Task17_challenge2019/nnUNet
###############################################
2019-06-30 09:52:52.244086: unpacking dataset
2019-06-30 09:52:52.374044: done
2019-06-30 09:52:54.762235:
epoch:  0
gpu 0
Exception in worker 1
Traceback (most recent call last):
  File "/home/mehrtash/anaconda3/envs/torch/lib/python3.7/site-packages/batchgenerators/dataloading/multi_threaded_augmenter.py", line 46, in producer
    item = next(data_loader)
  File "/home/mehrtash/anaconda3/envs/torch/lib/python3.7/site-packages/batchgenerators/dataloading/data_loader.py", line 126, in __next__
    return self.generate_train_batch()
  File "/home/mehrtash/projects/nnUNet/nnunet/training/dataloading/dataset_loading.py", line 217, in generate_train_batch
    case_all_data = np.load(self._data[i]['data_file'][:-4] + ".npy", self.memmap_mode)
  File "/home/mehrtash/anaconda3/envs/torch/lib/python3.7/site-packages/numpy/lib/npyio.py", line 418, in load
    return format.open_memmap(file, mode=mmap_mode)
  File "/home/mehrtash/anaconda3/envs/torch/lib/python3.7/site-packages/numpy/lib/format.py", line 802, in open_memmap
    mode=mode, offset=offset)
  File "/home/mehrtash/anaconda3/envs/torch/lib/python3.7/site-packages/numpy/core/memmap.py", line 264, in __new__
    mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=start)
ValueError: mmap length is greater than file size
Traceback (most recent call last):
  File "run/run_training.py", line 105, in <module>
    trainer.run_training()
  File "/home/mehrtash/projects/nnUNet/nnunet/training/network_training/nnUNetTrainer.py", line 263, in run_training
    super(nnUNetTrainer, self).run_training()
  File "/home/mehrtash/projects/nnUNet/nnunet/training/network_training/network_trainer.py", line 342, in run_training
    l = self.run_iteration(self.tr_gen, True)
  File "/home/mehrtash/projects/nnUNet/nnunet/training/network_training/network_trainer.py", line 509, in run_iteration
    data_dict = next(data_generator)
  File "/home/mehrtash/anaconda3/envs/torch/lib/python3.7/site-packages/batchgenerators/dataloading/multi_threaded_augmenter.py", line 190, in __next__
    item = self.__get_next_item()
  File "/home/mehrtash/anaconda3/envs/torch/lib/python3.7/site-packages/batchgenerators/dataloading/multi_threaded_augmenter.py", line 172, in __get_next_item
    raise RuntimeError("MultiThreadedAugmenter.abort_event was set, something went wrong. Maybe one of "
RuntimeError: MultiThreadedAugmenter.abort_event was set, something went wrong. Maybe one of your workers crashed

Hi David,

Hi David,
the code was made to run on 12 GB GPUs and an adaptation is not very straightforward. You would have to adjust Generic_UNet.use_this_for_batch_size_computation_3D to something else and then rerun EVERYTHING (especially experiment planning AND preprocessing). Before you do that I recommend you manually edit base_num_features in the plans files (located where the preprocessed data is). Set it to 20 instead of 30 and it should work.
Best,
Fabian

Originally posted by @FabianIsensee in #7 (comment)

trouble with python experiment_planning/plan_and_preprocess_task.py -t Task04_Hippocampus -pl 8

Hi FabianIsensee!
When I set base='data', put the data (Task04_Hippocampus) in it, and change

network_training_output_dir = os.path.join(os.environ['RESULTS_FOLDER'], my_output_identifier)

to

network_training_output_dir = os.path.join(base, my_output_identifier)

and run python experiment_planning/plan_and_preprocess_task.py -t Task04_Hippocampus,

a FileNotFoundError: [Errno 2] occurs.
So I put the data (Task04_Hippocampus) into nnUNet_raw_splitted, but no folder was created as @JunMa11 showed.
Then another problem occurs, like this:
[Image]
[Image]

Question on path setting and data preparation

Dear DKFZ,

Thanks for the great repo. I have some problems with the path settings.

Environment: Linux, PyTorch 1.0
The installation works well.

Data preparation. I downloaded the Task04_Hippocampus dataset from the Medical Segmentation Decathlon and put it into path/nnUNet/nnunet.

Step 1. Set base = path/nnUNet/nnunet/Task04_Hippocampus.
Step 2. Run python experiment_planning/plan_and_preprocess_task.py -t Task04_Hippocampus. The following error occurred:

Traceback (most recent call last):
  File "experiment_planning/plan_and_preprocess_task.py", line 18, in <module>
    from nnunet.paths import splitted_4d_output_dir, cropped_output_dir, preprocessing_output_dir, raw_dataset_dir
  File "/path/nnUNet/nnunet/paths.py", line 51, in <module>
    network_training_output_dir = os.path.join(os.environ['RESULTS_FOLDER'], my_output_identifier)
  File "/home/jma/anaconda3/envs/torch10/lib/python3.6/os.py", line 669, in __getitem__
    raise KeyError(key) from None
KeyError: 'RESULTS_FOLDER'

At the same time, two folders (nnUNet_raw and nnUNet_raw_splitted) are generated in path/nnUNet/nnunet/Task04_Hippocampus.
I changed network_training_output_dir to

network_training_output_dir = os.path.join(base, my_output_identifier)

Step 3. I then put the Task04_Hippocampus dataset into path/nnUNet/nnunet/Task04_Hippocampus/nnUNet_raw/ and path/nnUNet/nnunet/Task04_Hippocampus/nnUNet_raw_splitted/
but a new error occurred:

Traceback (most recent call last):
  File "experiment_planning/plan_and_preprocess_task.py", line 253, in <module>
    crop(task, override=override, num_threads=processes)
  File "experiment_planning/plan_and_preprocess_task.py", line 131, in crop
    imgcrop.run_cropping(lists, overwrite_existing=override)
  File "path/nnUNet/nnunet/preprocessing/cropping.py", line 203, in run_cropping
    p.map(self._load_crop_save_star, list_of_args)
  File "/home/jma/anaconda3/envs/torch10/lib/python3.6/multiprocessing/pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/jma/anaconda3/envs/torch10/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
RuntimeError: Exception thrown in SimpleITK ReadImage: /tmp/SimpleITK/Code/IO/src/sitkImageReaderBase.cxx:99:
sitk::ERROR: The file "path/nnUNet/nnunet/Task04_Hippocampus/nnUNet_raw_splitted/Task04_Hippocampus/imagesTr/hippocampus_367_0000.nii.gz" does not exist.

Question:

I also read the introduction in challenge_dataset_conversion. It describes well how to convert a personal dataset to make it compatible with nnU-Net, especially for multi-modality data. Since nnU-Net was initially developed for the MSD challenge, it would be helpful to provide an example for an MSD dataset, too. I recommend Task04_Hippocampus, because this dataset is very small.

How to change the CPU core or thread number for on-the-fly data augmentation?

Hi, Fabian,

Thanks a lot for your great contribution!
You have said that nnU-Net currently cannot adapt to the available GPU memory. I am now trying to modify it myself so that it trains on more GPUs, but the process always seems to be blocked by the augmentation, and I cannot find where to set more CPU cores to accelerate the augmentation. Could you tell me? Thank you!

No module named 'nnunet'

I downloaded the cardiac dataset from MSD and copied it into base/nnUNet_raw/Task02_Heart. However, when I run preprocessing and experiment planning by executing
python /data/ch/nnUNet/nnunet/experiment_planning/plan_and_preprocess_task.py -t Task02_Heart -p 3
an error occurs:
File "/data/ch/nnUNet/nnunet/experiment_planning/plan_and_preprocess_task.py", line 14, in <module>
    from nnunet.experiment_planning.find_classes_in_slice import add_classes_in_slice_info
ModuleNotFoundError: No module named 'nnunet'
I checked the nnunet directory, and there is an __init__.py in it. Why does this error occur?
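
One likely cause, sketched under assumptions: when a script is run by path, Python puts the script's own directory on sys.path, not the directory that contains the nnunet package, so `import nnunet` fails even though the package exists. The helpers below (`package_parent`, `ensure_importable`) are hypothetical names used only for illustration; they locate the parent of the package folder and add it to sys.path.

```python
import os
import sys


def package_parent(script_path, package="nnunet"):
    """Return the directory that contains the `package` folder."""
    parts = os.path.normpath(script_path).split(os.sep)
    if package not in parts:
        raise ValueError(f"{package!r} is not a component of {script_path!r}")
    idx = parts.index(package)
    # An empty result means the package sits directly under the root.
    return os.sep.join(parts[:idx]) or os.sep


def ensure_importable(script_path, package="nnunet"):
    """Put the package's parent directory at the front of sys.path."""
    parent = package_parent(script_path, package)
    if parent not in sys.path:
        sys.path.insert(0, parent)
    return parent
```

Setting PYTHONPATH to that parent directory, or installing the repository into the active environment, should have the same effect without touching the script.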

train with Task01_BrainTumour has a problem

Hi @FabianIsensee,
when I run Experiment Planning and Preprocessing on the BRATS data, I get this error:
RuntimeError: Exception thrown in SimpleITK ReadImage: /tmp/SimpleITK/Code/IO/src/sitkImageReaderBase.cxx:99: sitk::ERROR: The file "task1_data/nnUNet_raw_splitted/Task01_BrainTumour/imagesTr/BRATS_463_0000.nii.gz" does not exist.
So I renamed the files in nnUNet_raw_splitted/Task01_BrainTumour/imagesTr to match that name, and then another error appeared:
[Screenshot from 2019-07-02 16-25-06]
Should I rename the files in Task01_BrainTumour/imagesTr again, e.g. to BRATS_164_0001.nii.gz, and merge them with the ones I renamed before? Looking forward to your reply!
best.
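
The file names in the errors above suggest a pattern of case identifier, four-digit modality index, and extension (BRATS_463_0000.nii.gz, BRATS_164_0001.nii.gz). A small sketch, assuming that pattern holds for every modality, lists what a case is expected to provide and which files are missing. The function names and the modality count of 4 are assumptions for illustration only.

```python
def expected_image_files(case_id, num_modalities, ext=".nii.gz"):
    """File names expected for one case: <case_id>_<4-digit modality><ext>."""
    return [f"{case_id}_{m:04d}{ext}" for m in range(num_modalities)]


def missing_files(case_id, num_modalities, present):
    """Expected files for the case that are absent from `present`."""
    present = set(present)
    return [
        name
        for name in expected_image_files(case_id, num_modalities)
        if name not in present
    ]


# Only modality 0 exists on disk, so modalities 1..3 are reported as missing.
print(missing_files("BRATS_463", 4, ["BRATS_463_0000.nii.gz"]))
```

Passing the result of `os.listdir` on the imagesTr folder as `present` turns this into a quick pre-flight check before planning and preprocessing.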

how to set os.environ['nnUNet_base']?

In paths.py, base = os.environ['nnUNet_base']. Is nnUNet_base the path where the data is stored? If I place the data in another folder, such as '/data/ch', should nnUNet_base be set to '/data/ch'?
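
A minimal sketch of one way to set the variable from Python, assuming paths.py reads it at import time, so it has to be set before nnunet is imported. The sub-folder names nnUNet_raw and nnUNet_raw_splitted are taken from the other issues on this page; the helper `data_layout` is hypothetical.

```python
import os


def data_layout(base):
    """Map the raw-data folder names seen in these issues to paths under base."""
    return {
        name: os.path.join(base, name)
        for name in ("nnUNet_raw", "nnUNet_raw_splitted")
    }


# Set the variable only if the shell has not already exported it.
os.environ.setdefault("nnUNet_base", "/data/ch")

for name, path in data_layout(os.environ["nnUNet_base"]).items():
    print(name, "->", path)
```

Exporting the variable in the shell (for example in ~/.bashrc) is the more common route, since it then applies to every script that is run.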

How to evaluate the test results?

Hi @FabianIsensee, I have learned a lot from your work and been inspired by it! I trained on my own data and got some promising results using your code. But a few questions still bother me:

  1. After the training and test phases, how can I get the quantitative metrics? I tried to use evaluator.py and collect_results_files.py, but they do not work.
  2. I don't know whether data with different channels can be fed into the network for training simultaneously. Namely, if I have data with shapes like (328, 400, 400) and (290, 400, 400), can I train on them in the same network?
  3. By the way, it would be wonderful to have a data augmentation implementation for NIfTI or other 3D medical data formats.
    Forgive my poor English, have a good day!

Could this repository be used as a baseline with a single RTX 2080 Ti?

Hello @FabianIsensee, congratulations on the results on the no-new-net, really great work.

I wanted to know your opinion on how much significant research progress in cardiac MRI segmentation, using your work as a baseline, could be accomplished given that my lab currently has only a single RTX 2080 Ti GPU.

Any advice on the matter will be well received.

P.S. This is a question-related issue so maybe you could just label this issue as question to help yourself and other users of your repository.

Turn off validation at the end of training

Hi @FabianIsensee ,

When the training is finished, nnUNet will predict and evaluate the validation set automatically.

It's a great setting. However, I need to turn it off, as my cloud server sometimes runs into memory problems.

To achieve this, I just need to comment out this line, right?

trainer.validate(save_softmax=args.npz, validation_folder_name=val_folder)

Best,
Jun

Question on the function 'validate' of the file 'nnUNetTrainer.py'

Hi, Fabian
I am now trying the cascade model on my own data and ran into a problem.
First, I used the 3d_lowres model to train on the data. The fold was just 4, and it worked well.
When I used 3d_cascade_fullres to train, I got an error: 'seg from prev stage missing: ****'. This error came from the function 'do_split' in the file 'nnUNetTrainerCascadeFullRes.py'.
So I checked the code.
I do not understand the following code in the function 'validate' in the file 'nnUNetTrainer.py':
for k in self.dataset_val.keys():
......
I think 'dataset_val' should be 'dataset', so that the code reads:
for k in self.dataset.keys():
......
I found that 'dataset_val.keys' is just a subset of 'dataset.keys'.

Is this a bug? I hope you can understand what I mean. :)

Best,
Yaliang

PS: I found the same code in the function 'predict_next_stage' in the file 'predict_next_stage.py':
for pat in trainer.dataset_val.keys():
.......

int32 overflow in compute_approx_vram_consumption

Hi Fabian,

First of all - outstanding work! Having read your paper, I am excited (and a bit anxious) to test nnUNet on our data set and to use it - as suggested - as a baseline for our own 3D-UNet based segmentation architecture.
However, along the way I stumbled across an issue with the plan_and_preprocess_task.py script - specifically the get_properties_for_stage method, which created a plan with unreasonable values for batch_size (negative) and patch_size (too large: 512x512x309). I traced the problem down to the compute_approx_vram_consumption method, where a variable tmp overflows its int32 limit and ends up negative. Converting tmp to numpy.int64 solved the problem for me, and I now end up with batch_size=2 and patch_size=96x60x128.

Kind regards,
Bertram
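
The wrap-around described above can be reproduced with a short sketch. To keep it free of dependencies it emulates fixed-width integers with ctypes instead of NumPy, and the factor of 30 feature maps is an illustrative assumption, not the actual value used inside compute_approx_vram_consumption.

```python
import ctypes


def product_int32(values):
    """Multiply values, wrapping to signed 32 bits after every step."""
    acc = 1
    for v in values:
        acc = ctypes.c_int32(acc * v).value
    return acc


def product_int64(values):
    """Same product, wrapping to signed 64 bits instead."""
    acc = 1
    for v in values:
        acc = ctypes.c_int64(acc * v).value
    return acc


# Voxels of a 512x512x309 patch times a hypothetical 30 feature maps.
factors = [512, 512, 309, 30]

print(product_int32(factors))  # -1864892416: exceeds 2**31 - 1 and wraps
print(product_int64(factors))  # 2430074880: fits comfortably in 64 bits
```

A negative intermediate value of this kind is consistent with the negative batch_size reported above, which is why widening the accumulator to 64 bits removes the symptom.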

instance segmentation

Any idea how this network might perform on an instance segmentation problem? Maybe on a 20-class problem?

Thanks

Does nnU-Net adapt its GPU memory usage to the specific GPU?

Hi Fabian,
Your paper says the following:
All U-Nets are configured to optimally use 12GB Nvidia TitanX GPUs. There is currently no way of adapting to smaller or larger GPUs.
I take that to mean that nnU-Net uses at most 12 GB of memory.

But when I use a 24 GB Nvidia Titan RTX GPU to train on the Task10_Colon data, nnU-Net reports:
RuntimeError: CUDA out of memory. Tried to allocate 5.63 GiB (GPU 0; 23.76 GiB total capacity; 17.47 GiB already allocated; 5.50 GiB free; 155.50 KiB cached)
I had switched the GPU to TCC mode.
What should I do to resolve this problem?
best,
Yaliang :)

Question about LITS task

Hi, thanks for your amazing work!

May I ask how the 2D, 3D and cascade models each perform on the LiTS dataset? Without ensembling, which one has the better Dice result?

I tried to find an answer in your paper, but it seems to provide results only for the 3D model.

Thanks a lot!

Trained model

Hi Fabian,
thanks a lot for making your code public and taking the time to write comprehensible READMEs.
I'm trying to run your code on a Windows system with GTX1080ti GPUs. The code seems to work correctly after replacing the 'path.split("/")[-1]' lines with 'os.path.split(path)[1]', or the equivalent for other 'path.split("/")' calls.
However, I would like to compare my 2D U-Net model to yours. Is your trained 2D U-Net model for the Hippocampus task available anywhere?
Alternatively, is there a plot of your loss while training the 2D U-Net model?
Best,
Camila
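
The portability problem behind that replacement can be shown with the standard library alone. The sketch uses ntpath, which is what os.path resolves to on Windows, so it behaves the same on any platform; the example path is made up for illustration.

```python
import ntpath

# Hypothetical Windows-style path, for illustration only.
win_path = r"C:\data\Task04_Hippocampus\imagesTr\hippocampus_367_0000.nii.gz"

# Splitting on "/" finds no separator, so the whole path comes back.
naive_name = win_path.split("/")[-1]

# ntpath understands both "\\" and "/" as separators.
portable_name = ntpath.basename(win_path)

print(naive_name == win_path)  # True
print(portable_name)           # hippocampus_367_0000.nii.gz
```

In cross-platform code, os.path.basename(path) or pathlib.Path(path).name picks the right separator rules for the running system.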

Suggest to add the description of HOW_MUCH_OF_A_PATIENT_MUST_THE_NETWORK_SEE_AT_STAGE0 in the readme

Hey Fabian!
Thanks for your amazing work!
I tried to use the cascade mode to test an MRI dataset whose median size is [100, 380, 300]. However, when I used plan_and_preprocess_task.py to generate the preprocessed dataset, I found that it does not create a low-resolution dataset for stage 0. I checked the code and found that this was because the default value of HOW_MUCH_OF_A_PATIENT_MUST_THE_NETWORK_SEE_AT_STAGE0 was set to 4. But if the target object is large, e.g. the desired receptive region is half of the image, this value should be tuned to 2. It would be great if you added a description of it to the Experiment Planning and Preprocessing section of readme.md, so that other people can easily find out how to generate the lower-resolution dataset for their own data.
Again, thanks for your great work!

regards,
Hao

Reduce memory usage?

Hi!

Thank you for a very interesting framework!

I am trying to run some training on a custom dataset with the 3D U-Net, and I keep getting CUDA out of memory errors. I am using an Nvidia GeForce GTX 1080 to train the network. Is there anything I can adjust in the code to reduce the batch size or patch size?

Cheers,
David

What's the rapid way to stop data augmentation?

Hi Fabian,

I want to train a model without data augmentation. What is the quickest way to disable it?

Should I comment out these parameters, or do something else?

default_3D_augmentation_params = {
"selected_data_channels": None,
"selected_seg_channels": None,
"do_elastic": True,
"elastic_deform_alpha": (0., 900.),
"elastic_deform_sigma": (9., 13.),
"do_scaling": True,
"scale_range": (0.85, 1.25),
"do_rotation": True,
"rotation_x": (-15./360 * 2. * np.pi, 15./360 * 2. * np.pi),
"rotation_y": (-15. / 360 * 2. * np.pi, 15. / 360 * 2. * np.pi),

Looking forward to your reply.

Best,
Jun
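
Rather than commenting keys out of the dictionary, one option is to copy it and switch the do_* flags off. The sketch below only manipulates a plain dictionary with the three flags visible in the snippet above; whether the trainer honours a modified copy, and whether other augmentations are controlled elsewhere, is an assumption that would need checking against the code. The function name is hypothetical.

```python
import copy


def without_spatial_augmentation(params):
    """Return a copy of params with the do_* flags from the snippet set to False."""
    out = copy.deepcopy(params)
    for key in ("do_elastic", "do_scaling", "do_rotation"):
        if key in out:
            out[key] = False
    return out


# Subset of the dictionary quoted in the question.
example_params = {
    "do_elastic": True,
    "elastic_deform_alpha": (0., 900.),
    "do_scaling": True,
    "scale_range": (0.85, 1.25),
    "do_rotation": True,
}

print(without_spatial_augmentation(example_params))
```

Working on a copy leaves the defaults untouched, so other configurations that share the dictionary are not affected.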

How to load a pretrain model to nnUNet?

Hi Fabian,

I have trained a model on dataset A. Now I want to use it as a pretrained model for another dataset B. Is there any way to do this?

Looking forward to your reply.

Best,
Jiale

Question on overfitting

Dear Fabian,

Thanks for the great work.

I trained two 3D fullres U-Nets, one on LiTS and one on my local data (similar to LiTS, but with only 75 training cases).

[Figures: local dataset; LiTS]

However, the trained U-Nets have an overfitting problem.
Q1. Is there any way to reduce the overfitting by adjusting parameters?

Q2. nnU-Net does not use dropout by default, right?
If so, have you tried nnU-Net with dropout?

Best,
Jun

How to train the network on own dataset?

I want to train the network on my own dataset and evaluate the performance of nnU-Net for comparison. But I find too many folds in your source code. How can I train your code on my own data?
Thanks!

About the inference result of the cascade model

Hi Fabian,
I have a question about running inference with the cascade model.
I used the cascade model to train on my data (left kidney data).
The training commands:
python run/run_training.py 3d_lowres nnUNetTrainer Task50_LeftKidney all --ndet
python run/run_training.py 3d_cascade_fullres nnUNetTrainerCascadeFullRes Task50_LeftKidney all --ndet

The inference commands (on Windows 10):
python inference/predict_simple.py -f all -i Task50_LeftKidney\imagesTs-0000 -o Task50_LeftKidney\cascade\lowres -t Task50_LeftKidney -tr nnUNetTrainer -m 3d_lowres

python inference/predict_simple.py -f all -i Task50_LeftKidney\imagesTs-0000 -o Task50_LeftKidney\cascade\final -t Task50_LeftKidney -tr nnUNetTrainerCascadeFullRes -m 3d_cascade_fullres -l Task50_LeftKidney\cascade\lowres

But the inference result is strange.
The Dice of '3d_lowres' is 0.91,
while the Dice of '3d_cascade_fullres' is 0.47, which is very bad :(
(PS: the Dice of '3d_fullres' is 0.76)

I think 'fullres' should be better than 'lowres'.
What do you think?
Best,
Yaliang

Should the original patient image be square?

Hi, Fabian,

I am now using my own dataset, but I don't know whether the image shape must be square. For example, if I have an image of shape (100, 601, 202), corresponding to (channel, x, y), must I pad it to (100, 601, 601)? Will the preprocessing script handle this automatically, or can the network accept non-square images?

Thank you!

When 'fold=all' is set, how will the model be trained?

Hi, Fabian:

You have said that by default the model is trained with 5-fold cross-validation. I think this means that if we set fold=number, one particular data split is chosen and 1/5 of the data is held out for validation and not trained on. If we then run all 5 splits of the 5-fold cross-validation, we end up with 5 saved models.

But I want to know how the training data is used to get one model when we set 'fold=all'. That is, how is the data split into training and validation in that case?

Thank you!
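
The first half of the question, one fold holding out one fifth of the cases, can be sketched in a few lines. This is a simplified deterministic split for illustration; the function name is hypothetical, the actual split logic in the trainer may shuffle the cases, and the sketch makes no claim about what fold=all does.

```python
def split_for_fold(case_ids, fold, num_folds=5):
    """Return (train, val) for one fold; every num_folds-th case is held out."""
    if not 0 <= fold < num_folds:
        raise ValueError(f"fold must be in [0, {num_folds})")
    ordered = sorted(case_ids)
    val = ordered[fold::num_folds]
    held_out = set(val)
    train = [c for c in ordered if c not in held_out]
    return train, val


cases = [f"case_{i:03d}" for i in range(10)]
for fold in range(5):
    train, val = split_for_fold(cases, fold)
    print(fold, len(train), val)
```

Across the five folds every case is used for validation exactly once, which is why five separate models result from a full cross-validation run.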

Training with LiTS Dataset

I used the full-resolution 3D U-Net architecture from your repo:
norm_op_kwargs = {'eps': 1e-5, 'affine': True}
dropout_op_kwargs = {'p': 0.5, 'inplace': True}
net_nonlin = nn.LeakyReLU
net_nonlin_kwargs = {'negative_slope': 1e-2, 'inplace': True}
net = Generic_UNet(input_channels=1, base_num_features=30, num_classes=2, num_pool=4, num_conv_per_stage=2,
                   feat_map_mul_on_downscale=2, conv_op=nn.Conv3d,
                   norm_op=nn.BatchNorm3d, norm_op_kwargs=norm_op_kwargs,
                   dropout_op=nn.Dropout3d, dropout_op_kwargs=dropout_op_kwargs,
                   nonlin=nn.LeakyReLU, nonlin_kwargs=net_nonlin_kwargs, deep_supervision=False)

I created my own dataset by resampling and taking random crops of 128^3 as the training input, with liver and lesion masks as prediction targets, and used the same optimiser and scheduler settings. Even after 100k training iterations with extensive data augmentation and the learning rate reduced to its floor, the training Dice for lesions never rose above 0.01; only the liver Dice kept increasing, and it stopped at 0.5. I tried Dice loss, cross entropy, and both combined. Am I missing anything here? I know it is hard to tell without looking at my code. Is there any general advice or tips on how to balance the loss?

Experiment plans

Thank you very much for sharing your code, it is awesome and your comments also made me laugh.

I would like to ask if you could share the experiment plan files for the experiments presented in the MICCAI 2019 nnUnet paper. They may help those of us who are trying to replicate your results without using the whole nnUnet structure.

Thank you very much!

Does the nnUNet code work well?

Hi FabianIsensee:
Incredible and outstanding work! You are the best researcher in the segmentation area. Do you still remember me? This is Xinyu.
Oh my god, reading your nnUNet paper tonight got me really excited!
It will be a strong baseline for U-Net.
Does it work now?

Best
Xinyu

about training model evaluation

Hi, Fabian. During training, I see the following log output:

==============================================================

2019-08-05 21:29:11.999288: val loss (train=False): -0.4625
2019-08-05 21:29:12.180008: Val glob dc per class: [0.9191570267380906, 0.5505475351815083]
2019-08-05 21:29:13.074796: lr is now (scheduler) 0.0003
2019-08-05 21:29:13.161585: current best_val_eval_criterion_MA is 0.75620
2019-08-05 21:29:13.226852: current val_eval_criterion_MA is 0.7540
2019-08-05 21:29:13.259483: New best epoch (train loss MA): -0.4183
2019-08-05 21:29:13.300148: Patience: 0/50
2019-08-05 21:29:13.347112: This epoch took 1807.986493 s

================================================================
Which of these values indicates the quality of the model: Val glob dc per class, or current best_val_eval_criterion_MA? My understanding is that the higher best_val_eval_criterion_MA is, the better the model.

Thanks~
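
As a sketch of how the logged values might relate, assuming the "MA" suffix stands for a moving average of the mean per-class Dice: the code below averages the two per-class values from the log and shows a generic exponential moving-average update. The smoothing factor alpha=0.9 and both function names are assumptions, not values read from the trainer.

```python
def mean_dice(per_class):
    """Average the per-class Dice values of one validation pass."""
    return sum(per_class) / len(per_class)


def update_moving_average(previous, current, alpha=0.9):
    """Exponential moving average; the first value initialises it."""
    if previous is None:
        return current
    return alpha * previous + (1 - alpha) * current


# Per-class values copied from the log line "Val glob dc per class".
epoch_dice = mean_dice([0.9191570267380906, 0.5505475351815083])
print(round(epoch_dice, 4))  # 0.7349
```

Under this reading, the per-class line reflects a single epoch, while the smoothed criterion changes slowly and is less sensitive to epoch-to-epoch noise.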
