project-monai / tutorials

MONAI Tutorials

Home Page: https://monai.io/started.html

License: Apache License 2.0

Jupyter Notebook 98.60% Python 1.28% Shell 0.11% Dockerfile 0.01%

monai monai-tutorials pytorch jupyter-notebook monai-workflows

tutorials's Issues

MONAI flags: HAS_EXT = False, USE_COMPILED = False

What does this line mean?
I wanted to reproduce the baseline lesion network.
Every time I start training, this line appears. Is something wrong?

double check all examples/tutorials

Since there have been breaking changes since v0.2, we need to rerun and double-check all the examples/tutorials for v0.3.

How to modify the loss function as Dice + CE loss?

Hi,
I am working on a segmentation task with only one target structure. I tried to change the loss function to Dice + CE loss by modifying the code as shown here:

import monai
import torch
import torch.nn as nn


class CrossEntropyLoss(nn.Module):
    def __init__(self):
        super().__init__()
        self.loss = nn.CrossEntropyLoss()

    def forward(self, y_pred, y_true):
        # CrossEntropyLoss target needs to have shape (B, D, H, W)
        # Target from pipeline has shape (B, 1, D, H, W)
        y_true = torch.squeeze(y_true, dim=1).long()
        return self.loss(y_pred, y_true)


class DiceCELoss(nn.Module):
    def __init__(self):
        super().__init__()
        self.dice = monai.losses.DiceLoss(sigmoid = True)
        self.cross_entropy = CrossEntropyLoss()

    def forward(self, y_pred, y_true):
        dice = self.dice(y_pred, y_true)
        cross_entropy = self.cross_entropy(y_pred, y_true)
        return dice + cross_entropy

loss_function = DiceCELoss()

However, after this change the code no longer works and reports the error below. Could you please help me find what is wrong?

/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [34,0,0], thread: [893,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [34,0,0], thread: [894,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [34,0,0], thread: [895,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [34,0,0], thread: [384,0,0] Assertion `t >= 0 && t < n_classes` failed.
Traceback (most recent call last):
  File "2D_UNet.py", line 259, in <module>
    main()
  File "2D_UNet.py", line 191, in main
    loss.backward()
  File "/home/anaconda3/envs/monai/lib/python3.7/site-packages/torch/tensor.py", line 221, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/anaconda3/envs/monai/lib/python3.7/site-packages/torch/autograd/__init__.py", line 132, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED

Thanks a lot.
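A likely cause of the assertion `t >= 0 && t < n_classes` is that `nn.CrossEntropyLoss` expects one output channel per class, so with a single-channel network output the label value 1 falls outside the valid range. The sketch below assumes the network emits one channel of raw logits and the mask contains only 0 and 1; under that assumption it pairs a soft Dice term with binary cross-entropy instead. `DiceBCELoss` and `smooth` are illustrative names, and the Dice term is written out in plain PyTorch rather than taken from MONAI.

```python
import torch
import torch.nn as nn


class DiceBCELoss(nn.Module):
    """Soft Dice + binary cross-entropy for a single-channel (sigmoid) output."""

    def __init__(self, smooth: float = 1e-5):
        super().__init__()
        self.smooth = smooth
        self.bce = nn.BCEWithLogitsLoss()

    def forward(self, y_pred: torch.Tensor, y_true: torch.Tensor) -> torch.Tensor:
        # y_pred: raw logits with shape (B, 1, ...); y_true: binary mask, same shape.
        y_true = y_true.float()
        prob = torch.sigmoid(y_pred)
        dims = tuple(range(1, y_pred.dim()))  # reduce over channel and spatial dims
        intersection = (prob * y_true).sum(dim=dims)
        denominator = prob.sum(dim=dims) + y_true.sum(dim=dims)
        dice = 1.0 - (2.0 * intersection + self.smooth) / (denominator + self.smooth)
        # Both terms consume the same logits; BCEWithLogitsLoss applies sigmoid itself.
        return dice.mean() + self.bce(y_pred, y_true)
```

If the network instead has two output channels (background and foreground), keeping `nn.CrossEntropyLoss` and switching the Dice term to a softmax/one-hot formulation would be the matching alternative.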

tkinter issue: RuntimeError: main thread is not in main loop

Describe the bug
I got a tkinter runtime error related to threads when locally running spleen_segmentation_3d.ipynb, in the epoch cell.

epoch 12/600
1/16, train_loss: 0.5761
2/16, train_loss: 0.5969
3/16, train_loss: 0.4487
4/16, train_loss: 0.5615
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
5/16, train_loss: 0.5464
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x7f1d2ec583a0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 351, in __del__
    if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Tcl_AsyncDelete: async handler deleted by the wrong thread
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Tcl_AsyncDelete: async handler deleted by the wrong thread
Traceback (most recent call last):
  File "/home/phc/.virtualenvs/monai/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 872, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "/usr/lib/python3.8/multiprocessing/queues.py", line 107, in get
    if not self._poll(timeout):
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 257, in poll
    return self._poll(timeout)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 424, in _poll
    r = wait([self], timeout)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 931, in wait
    ready = selector.select(timeout)
  File "/usr/lib/python3.8/selectors.py", line 415, in select
    fd_event_list = self._selector.poll(timeout)
  File "/home/phc/.virtualenvs/monai/lib/python3.8/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 355179) is killed by signal: Aborted.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "spleen_segmentation_3d.py", line 268, in <module>
    for batch_data in train_loader:
  File "/home/phc/.virtualenvs/monai/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/home/phc/.virtualenvs/monai/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1068, in _next_data
    idx, data = self._get_data()
  File "/home/phc/.virtualenvs/monai/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1034, in _get_data
    success, data = self._try_get_data()
  File "/home/phc/.virtualenvs/monai/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 885, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
RuntimeError: DataLoader worker (pid(s) 355179) exited unexpectedly

Environment (please complete the following information):

OS: Linux (arch)
Python version: 3.8.6
MONAI version: 0.4.0
CUDA/cuDNN version: 11.1
GPU models and configuration: 3090

Additional context
Sorry for the brevity of the report. The notebook is run as a Python script using jupytext (which converts ipynb to py).
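One commonly reported cause of "main thread is not in main loop" is matplotlib's Tk backend: figure objects created in the main thread get garbage-collected elsewhere, and Tk objects may only be deleted from the thread that created them. Assuming that is what happens here (the report does not confirm it), a minimal workaround when running the notebook as a script is to select the non-interactive Agg backend before matplotlib is imported:

```python
import os

# Select matplotlib's non-interactive Agg backend through its MPLBACKEND
# environment variable. This must run before matplotlib.pyplot is imported
# anywhere in the process, so put it at the very top of the script.
os.environ["MPLBACKEND"] = "Agg"
```

With Agg no window is opened, so plots would need to be written to disk (for example with `plt.savefig`) instead of shown with `plt.show()`.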

Runtime error in the code. I am using Colab with a reduced training and validation set, and I set the number of epochs to 100, but this happens:

INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 5/100, Iter: 21/66 -- train_loss: 0.8563
ERROR:ignite.engine.engine.SupervisedTrainer:Current run is terminating due to exception: DataLoader worker (pid 1476) is killed by signal: Killed. .
ERROR:ignite.engine.engine.SupervisedTrainer:Exception: DataLoader worker (pid 1476) is killed by signal: Killed.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 811, in _run_once_on_dataset
self.state.output = self._process_function(self, self.state.batch)
File "/usr/local/lib/python3.6/dist-packages/monai/engines/trainer.py", line 156, in _iteration
self.scaler.step(self.optimizer)
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/amp/grad_scaler.py", line 320, in step
if not sum(v.item() for v in optimizer_state["found_inf_per_device"].values()):
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/amp/grad_scaler.py", line 320, in
if not sum(v.item() for v in optimizer_state["found_inf_per_device"].values()):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 1476) is killed by signal: Killed.
ERROR:ignite.engine.engine.SupervisedTrainer:Engine run is terminating due to exception: DataLoader worker (pid 1476) is killed by signal: Killed. .
ERROR:ignite.engine.engine.SupervisedTrainer:Exception: DataLoader worker (pid 1476) is killed by signal: Killed.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 730, in _internal_run
time_taken = self._run_once_on_dataset()
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 828, in _run_once_on_dataset
self._handle_exception(e)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 465, in _handle_exception
self._fire_event(Events.EXCEPTION_RAISED, e)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 423, in _fire_event
func(*first, *(event_args + others), **kwargs)
File "/usr/local/lib/python3.6/dist-packages/monai/handlers/stats_handler.py", line 145, in exception_raised
raise e
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 811, in _run_once_on_dataset
self.state.output = self._process_function(self, self.state.batch)
File "/usr/local/lib/python3.6/dist-packages/monai/engines/trainer.py", line 156, in _iteration
self.scaler.step(self.optimizer)
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/amp/grad_scaler.py", line 320, in step
if not sum(v.item() for v in optimizer_state["found_inf_per_device"].values()):
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/amp/grad_scaler.py", line 320, in
if not sum(v.item() for v in optimizer_state["found_inf_per_device"].values()):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 1476) is killed by signal: Killed.
Traceback (most recent call last):
File "run_net.py", line 301, in
train(data_folder=data_folder, model_folder=args.model_folder)
File "run_net.py", line 211, in train
trainer.run()
File "/usr/local/lib/python3.6/dist-packages/monai/engines/trainer.py", line 46, in run
super().run()
File "/usr/local/lib/python3.6/dist-packages/monai/engines/workflow.py", line 163, in run
super().run(data=self.data_loader, max_epochs=self.state.max_epochs)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 691, in run
return self._internal_run()
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 762, in _internal_run
self._handle_exception(e)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 465, in _handle_exception
self._fire_event(Events.EXCEPTION_RAISED, e)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 423, in _fire_event
func(*first, *(event_args + others), **kwargs)
File "/usr/local/lib/python3.6/dist-packages/monai/handlers/stats_handler.py", line 145, in exception_raised
raise e
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 730, in _internal_run
time_taken = self._run_once_on_dataset()
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 828, in _run_once_on_dataset
self._handle_exception(e)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 465, in _handle_exception
self._fire_event(Events.EXCEPTION_RAISED, e)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 423, in _fire_event
func(*first, *(event_args + others), **kwargs)
File "/usr/local/lib/python3.6/dist-packages/monai/handlers/stats_handler.py", line 145, in exception_raised
raise e
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 811, in _run_once_on_dataset
self.state.output = self._process_function(self, self.state.batch)
File "/usr/local/lib/python3.6/dist-packages/monai/engines/trainer.py", line 156, in _iteration
self.scaler.step(self.optimizer)
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/amp/grad_scaler.py", line 320, in step
if not sum(v.item() for v in optimizer_state["found_inf_per_device"].values()):
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/amp/grad_scaler.py", line 320, in
if not sum(v.item() for v in optimizer_state["found_inf_per_device"].values()):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 1476) is killed by signal: Killed.

verify notebooks using the latest monai 0.4.0 pypi release

Subtask of Project-MONAI/MONAI#1318.
We need to:

check that pip install monai[all] works for all the notebooks and demos
tag a 0.4.0 version of this repo

Wasserstein distance

Hello, I wanted to ask whether anyone has used the Wasserstein distance loss for segmenting different brain structures, because I have some issues. For example, the argument I should pass to my pipeline is a distance matrix, and I would like to know how to construct it; I don't know where those numbers come from. My other question is whether the Wasserstein distance loss is finished and has been validated in any experiment (other than brain tumor segmentation, because there the labels are continuous and my labels are separated).

Thank you.
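On where the numbers come from: in a Wasserstein-type segmentation loss, entry (i, j) of the distance matrix states how costly it is to confuse class i with class j, and these costs are chosen by the user from prior knowledge about the labels. The sketch below is only an illustration of one such choice, assuming a hypothetical grouping of structures; `build_distance_matrix`, the group names, and the values 0.5 and 1.0 are all made up for the example.

```python
def build_distance_matrix(groups, within_group=0.5, between_groups=1.0):
    """Return a symmetric class-distance matrix as a list of lists.

    groups[i] is the (hypothetical) anatomical group of class i. Classes in
    the same group are considered closer than classes in different groups.
    """
    n = len(groups)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a class has zero distance to itself
            same_group = groups[i] == groups[j]
            matrix[i][j] = within_group if same_group else between_groups
    return matrix


# Class 0 is background; classes 1 and 2 share a group; class 3 stands alone.
dist = build_distance_matrix(["background", "cortex", "cortex", "ventricle"])
```

The nested list would typically be converted to an array (for example with `numpy.asarray`) before being handed to the loss; check the loss's documentation for the exact type and value range it expects.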

Model fine tuning

I trained the spleen segmentation model for 200 epochs on the Decathlon dataset. When I then evaluated it on my own dataset, the segmentation performance was extremely poor. Do you know how I can fine-tune the model parameters on my own dataset? How should I do that, and for how many epochs? (My dataset comprises 20 manually segmented spleens.)

Thanks
Aymen
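A minimal sketch of one way to start fine-tuning, assuming the trained weights were saved with `torch.save(model.state_dict(), path)` and that the same network architecture is rebuilt before loading. `prepare_for_finetuning` is a hypothetical helper, and the learning rate of 1e-5 is only a placeholder for "smaller than the rate used for the original training".

```python
import torch
import torch.nn as nn


def prepare_for_finetuning(model: nn.Module, checkpoint_path: str, lr: float = 1e-5):
    """Load pretrained weights into `model` and build a low-learning-rate optimizer."""
    # map_location="cpu" lets the checkpoint load on machines without the original GPU.
    state_dict = torch.load(checkpoint_path, map_location="cpu")
    model.load_state_dict(state_dict)
    model.train()  # switch layers such as dropout/batch norm back to training mode
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    return model, optimizer
```

The returned model and optimizer can then be dropped into the tutorial's existing training loop with the new data loader. With only 20 cases there is no fixed epoch count to recommend; holding a few cases out for validation and stopping when the validation Dice stops improving is a reasonable default.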

Crashed (Baseline Unet model training for Covid-19 segmentation challenge)

(pytorch) rasho@rasho-WS-E500-G5-WS690T:~/covid-19_3D_Segmentation$ python run_net.py train --data_folder "COVID-19-20_v2/Train" --model_folder "runs"
MONAI version: 0.3.0+81.g62b0bbb
Python version: 3.7.9 (default, Aug 31 2020, 12:42:55) [GCC 7.3.0]
OS version: Linux (5.4.0-53-generic)
Numpy version: 1.19.2
Pytorch version: 1.7.0
MONAI flags: HAS_EXT = False, USE_COMPILED = False

Optional dependencies:
Pytorch Ignite version: 0.4.2
Nibabel version: 3.2.0
scikit-image version: NOT INSTALLED or UNKNOWN VERSION.
Pillow version: 8.0.1
Tensorboard version: NOT INSTALLED or UNKNOWN VERSION.
gdown version: NOT INSTALLED or UNKNOWN VERSION.
TorchVision version: 0.8.1
ITK version: NOT INSTALLED or UNKNOWN VERSION.
tqdm version: 4.52.0

For details about installing the optional dependencies, please visit:
https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies

INFO:root:training: image/label (199) folder: COVID-19-20_v2/Train
INFO:root:training: train 160 val 39, folder: COVID-19-20_v2/Train
INFO:root:batch size 2
Load and cache transformed data: 100%|█████████████████████████████████████████████████████████████████████| 160/160 [04:13<00:00, 1.58s/it]
Load and cache transformed data: 100%|███████████████████████████████████████████████████████████████████████| 39/39 [00:58<00:00, 1.51s/it]
BasicUNet features: (32, 32, 64, 128, 256, 32).
INFO:root:epochs 500, lr 0.0001, momentum 0.95
INFO:ignite.engine.engine.SupervisedTrainer:Engine run resuming from iteration 0, epoch 0 until 500 epochs
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 1/80 -- train_loss: 1.4053
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 2/80 -- train_loss: 1.3833
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 3/80 -- train_loss: 1.3598
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 4/80 -- train_loss: 1.3268
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 5/80 -- train_loss: 1.3438
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 6/80 -- train_loss: 1.3146
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 7/80 -- train_loss: 1.3164
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 8/80 -- train_loss: 1.3118
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 9/80 -- train_loss: 1.2970
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 10/80 -- train_loss: 1.2957
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 11/80 -- train_loss: 1.2779
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 12/80 -- train_loss: 1.2499
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 13/80 -- train_loss: 1.2641
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 14/80 -- train_loss: 1.2634
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 15/80 -- train_loss: 1.2439
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 16/80 -- train_loss: 1.2206
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 17/80 -- train_loss: 1.2209
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 18/80 -- train_loss: 1.2143
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 19/80 -- train_loss: 1.1976
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 20/80 -- train_loss: 1.1950
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 21/80 -- train_loss: 1.1833
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 22/80 -- train_loss: 1.1747
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 23/80 -- train_loss: 1.1739
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 24/80 -- train_loss: 1.1676
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 25/80 -- train_loss: 1.1586
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 26/80 -- train_loss: 1.1585
Killed
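A bare "Killed" with no Python traceback often means the operating system's out-of-memory killer stopped the process, although the log alone does not prove it. Since the log shows 160 training and 39 validation volumes being cached, a rough estimate of the cache size can help decide whether RAM is the bottleneck. The helper below is a back-of-the-envelope sketch; `estimate_cache_gib` is a hypothetical name, and the volume shape used in the example call is an assumption, not taken from the dataset.

```python
def estimate_cache_gib(num_items, shape, bytes_per_voxel=4, arrays_per_item=2):
    """Estimate RAM (in GiB) needed to keep preprocessed volumes in memory.

    arrays_per_item=2 counts one image and one label array per case;
    bytes_per_voxel=4 corresponds to float32 storage.
    """
    voxels = 1
    for size in shape:
        voxels *= size
    total_bytes = num_items * arrays_per_item * voxels * bytes_per_voxel
    return total_bytes / 1024 ** 3


# 199 cached cases (160 train + 39 val) at an assumed 512 x 512 x 100 voxels each.
print(f"{estimate_cache_gib(199, (512, 512, 100)):.1f} GiB")
```

If the estimate is close to or above the machine's RAM, caching fewer items or using fewer DataLoader workers would be the first things to try.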

3d classification no data download

Describe the bug
All 3D classification tutorials assume that the data is in 'workspace/data/medical/ixi/IXI-T1/', but none of them download it.

To Reproduce

Load any 3D classification tutorial and run.

Expected behavior
Tutorial should be able to run the whole way through without user intervention.

Environment (please complete the following information):
N/A
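A download cell along the following lines could make the tutorials self-contained. This is a standard-library sketch only: `ensure_dataset` is a hypothetical helper, the archive URL is deliberately left as a parameter because the issue does not name one, and it assumes the data is distributed as a tar archive from a trusted source.

```python
import tarfile
import urllib.request
from pathlib import Path


def ensure_dataset(data_dir, archive_url):
    """Download and unpack a tar archive into data_dir unless it already has content."""
    data_dir = Path(data_dir)
    if data_dir.is_dir() and any(data_dir.iterdir()):
        return data_dir  # data already present, skip the download
    data_dir.mkdir(parents=True, exist_ok=True)
    archive_path = data_dir / "download.tar"
    urllib.request.urlretrieve(archive_url, str(archive_path))
    # Only extract archives from a trusted source.
    with tarfile.open(archive_path) as archive:
        archive.extractall(data_dir)
    archive_path.unlink()  # remove the archive once its contents are unpacked
    return data_dir
```

A tutorial would call it once near the top, passing the expected data folder and the dataset's archive URL, so that later cells can keep their existing paths.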

Crash in spleen 3D segmentation tutorial

Describe the bug
While following https://github.com/Project-MONAI/tutorials/blob/17bf2ec91e2871898198084f4ba5e968c2bef47e/3d_segmentation/spleen_segmentation_3d.ipynb I get a traceback at the step "Execute a typical PyTorch training process".

To Reproduce
I installed everything using pip, in a virtual environment. I needed to allow the CPU back end, as my laptop has an AMD GPU: device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

Screenshots

----------
epoch 1/600
---------------------------------------------------------------------------
PicklingError                             Traceback (most recent call last)
<ipython-input-13-ab25791c97e3> in <module>
     12     epoch_loss = 0
     13     step = 0
---> 14     for batch_data in train_loader:
     15         step += 1
     16         inputs, labels = (

c:\dev\monai\pyenv\lib\site-packages\torch\utils\data\dataloader.py in __iter__(self)
    277             return _SingleProcessDataLoaderIter(self)
    278         else:
--> 279             return _MultiProcessingDataLoaderIter(self)
    280 
    281     @property

c:\dev\monai\pyenv\lib\site-packages\torch\utils\data\dataloader.py in __init__(self, loader)
    717             #     before it starts, and __del__ tries to join but will get:
    718             #     AssertionError: can only join a started process.
--> 719             w.start()
    720             self._index_queues.append(index_queue)
    721             self._workers.append(w)

C:\Dev\Python3.7.9\lib\multiprocessing\process.py in start(self)
    110                'daemonic processes are not allowed to have children'
    111         _cleanup()
--> 112         self._popen = self._Popen(self)
    113         self._sentinel = self._popen.sentinel
    114         # Avoid a refcycle if the target function holds an indirect

C:\Dev\Python3.7.9\lib\multiprocessing\context.py in _Popen(process_obj)
    221     @staticmethod
    222     def _Popen(process_obj):
--> 223         return _default_context.get_context().Process._Popen(process_obj)
    224 
    225 class DefaultContext(BaseContext):

C:\Dev\Python3.7.9\lib\multiprocessing\context.py in _Popen(process_obj)
    320         def _Popen(process_obj):
    321             from .popen_spawn_win32 import Popen
--> 322             return Popen(process_obj)
    323 
    324     class SpawnContext(BaseContext):

C:\Dev\Python3.7.9\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
     87             try:
     88                 reduction.dump(prep_data, to_child)
---> 89                 reduction.dump(process_obj, to_child)
     90             finally:
     91                 set_spawning_popen(None)

C:\Dev\Python3.7.9\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
     58 def dump(obj, file, protocol=None):
     59     '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60     ForkingPickler(file, protocol).dump(obj)
     61 
     62 #

PicklingError: Can't pickle <function CropForegroundd.<lambda> at 0x000001E8B1AED318>: attribute lookup CropForegroundd.<lambda> on monai.transforms.croppad.dictionary failed
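
The last line of the traceback points at a lambda stored on the transform. The frames above it go through popen_spawn_win32, so the worker processes are started by pickling the loader's contents, and the standard pickle module cannot serialize a lambda. A minimal standard-library sketch of that behaviour follows; the name `is_positive` and the use of `functools.partial` are illustrative assumptions, not MONAI's code:

```python
import functools
import operator
import pickle

# A lambda has no importable name, so the pickle module refuses it.
try:
    pickle.dumps(lambda x: x > 0)
    lambda_pickles = True
except (pickle.PicklingError, AttributeError):
    lambda_pickles = False

# An equivalent callable built from importable pieces pickles fine:
# operator.lt(0, x) is the same test as x > 0.
is_positive = functools.partial(operator.lt, 0)
restored = pickle.loads(pickle.dumps(is_positive))

print("lambda picklable:", lambda_pickles)
print("restored callable on 3:", restored(3))
```

Setting num_workers=0 on the DataLoader avoids the pickling step altogether, at the cost of loading data in the main process.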

Environment (please complete the following information):
Windows 10
MONAI version: 0.2.0
Python version: 3.7.9 (tags/v3.7.9:13c94747c7, Aug 17 2020, 18:58:18) [MSC v.1900 64 bit (AMD64)]
Numpy version: 1.19.1
Pytorch version: 1.4.0+cpu

Optional dependencies:
Pytorch Ignite version: 0.3.0
Nibabel version: 3.1.1
scikit-image version: 0.17.2
Pillow version: 7.2.0
Tensorboard version: 2.3.0

Update to use LoadImage transform

Is your feature request related to a problem? Please describe.
Since LoadImage is now the recommended loading method, we need to update all the examples and tutorials to use it.

Pip installs at start of each notebook?

Should we be pip installing monai and its dependencies at the start of each notebook?

Discussion continued from #47.

My personal feeling is that we should lift all pip installs from our notebooks as the relevant instructions are already in our README.md. It also saves us from having to update as our notebooks/dependencies change.

Error adapting Spleen example to different shaped dataset

Describe the bug
I am trying to adapt the spleen_segmentation_3d.ipynb notebook to imaging data with a slightly different shape. The images in the Spleen set are 226x257 with 113 planes in the stack. My images are 1200x340 with 20 planes in the stack. The notebook samples the data in cubes of size 96x96x96. To get the example notebook to work, I have to duplicate my data on the planes to be 20+20+20+20+16 = 96. Otherwise it breaks, for the obvious reason that you can't get 96 slices out of 20.

Suppose however that I change the cube size to 20x20x20, so I don't duplicate planes to match the exact setup of the notebook. I still get a problem. Here is the problem, please let me know how to resolve it:

RuntimeError                              Traceback (most recent call last)
<ipython-input-15-26b65d7e4120> in <module>
     19         )
     20         optimizer.zero_grad()
---> 21         outputs = model(inputs)
     22         loss = loss_function(outputs, labels)
     23         loss.backward()

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

~/anaconda3/envs/mona/lib/python3.8/site-packages/monai/networks/nets/unet.py in forward(self, x)
    190 
    191     def forward(self, x: torch.Tensor) -> torch.Tensor:
--> 192         x = self.model(x)
    193         return x
    194 

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/container.py in forward(self, input)
     98     def forward(self, input):
     99         for module in self:
--> 100             input = module(input)
    101         return input
    102 

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

~/anaconda3/envs/mona/lib/python3.8/site-packages/monai/networks/layers/simplelayers.py in forward(self, x)
     37 
     38     def forward(self, x: torch.Tensor) -> torch.Tensor:
---> 39         return torch.cat([x, self.submodule(x)], self.cat_dim)
     40 
     41 

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/container.py in forward(self, input)
     98     def forward(self, input):
     99         for module in self:
--> 100             input = module(input)
    101         return input
    102 

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

~/anaconda3/envs/mona/lib/python3.8/site-packages/monai/networks/layers/simplelayers.py in forward(self, x)
     37 
     38     def forward(self, x: torch.Tensor) -> torch.Tensor:
---> 39         return torch.cat([x, self.submodule(x)], self.cat_dim)
     40 
     41 

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/container.py in forward(self, input)
     98     def forward(self, input):
     99         for module in self:
--> 100             input = module(input)
    101         return input
    102 

~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

~/anaconda3/envs/mona/lib/python3.8/site-packages/monai/networks/layers/simplelayers.py in forward(self, x)
     37 
     38     def forward(self, x: torch.Tensor) -> torch.Tensor:
---> 39         return torch.cat([x, self.submodule(x)], self.cat_dim)
     40 
     41 

RuntimeError: Sizes of tensors must match except in dimension 2. Got 4 and 3
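
The two numbers in this message can be reproduced with plain arithmetic. This is a rough sketch, assuming each stride-2 level maps a length n to floor((n - 1) / 2) + 1 (kernel 3, padding 1) and each up-sampling level exactly doubles the length. These layer details are assumptions made for illustration, not read from the MONAI source:

```python
def trace_unet_sizes(size, num_strides=4):
    """Return (skip_size, upsampled_size) pairs from the deepest level outwards."""
    sizes = [size]
    for _ in range(num_strides):
        # Assumed stride-2 convolution with kernel 3 and padding 1.
        sizes.append((sizes[-1] - 1) // 2 + 1)
    pairs = []
    # sizes[level] is what the skip connection holds at that level;
    # the branch below it comes back at twice the size of the next level down.
    for level in range(num_strides - 1, 0, -1):
        pairs.append((sizes[level], 2 * sizes[level + 1]))
    return pairs


def first_mismatch(size, num_strides=4):
    """Return the first (skip, upsampled) pair that differs, or None."""
    for skip, up in trace_unet_sizes(size, num_strides):
        if skip != up:
            return skip, up
    return None


for edge in (20, 32, 96):
    print(edge, first_mismatch(edge))
```

Under these assumptions a patch edge of 20 shrinks to 10, 5, 3 and 2, and the up-sampled 2 becomes 4, which no longer matches the 3 held by the skip connection. Edges that are multiples of 16 (two to the power of the four strides) pass through cleanly, so with 20 planes available a depth of 16 would be the largest such choice.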

To Reproduce
Here is the code:

import glob, os, torch
from monai.data import CacheDataset, DataLoader, Dataset
from monai.inferers import sliding_window_inference
from monai.losses import DiceLoss
from monai.metrics import compute_meandice
from monai.networks.layers import Norm
from monai.networks.nets import UNet
from monai.utils import first, set_determinism
data_dir='nf1_monai'
os.environ['MONAI_DATA_DIRECTORY']=data_dir
directory = os.environ.get("MONAI_DATA_DIRECTORY")
root_dir = directory
train_images = sorted(glob.glob(os.path.join(data_dir, "imagesTr", "*.npy")))
train_labels = sorted(glob.glob(os.path.join(data_dir, "labelsTr", "*.npy")))
data_dicts = [
    {"image": image_name, "label": label_name}
    for image_name, label_name in zip(train_images, train_labels)
]
train_files, val_files = data_dicts[:-10], data_dicts[-10:]
set_determinism(seed=0)
from monai.transforms import (
    AddChanneld,
    Compose,
    LoadNumpyd,
    RandCropByPosNegLabeld,
    ToTensord,
)
train_transforms = Compose(
    [
        LoadNumpyd(keys=["image", "label"]),
        AddChanneld(keys=["image", "label"]),
        RandCropByPosNegLabeld(
            keys=["image", "label"],
            label_key="label",
            spatial_size=(20,20,20),
            pos=1,
            neg=1,
            num_samples=4,
            image_key="image",
            image_threshold=0,
        ),
        ToTensord(keys=["image", "label"]),
    ]
)
val_transforms = Compose(
    [
        LoadNumpyd(keys=["image", "label"]),
        AddChanneld(keys=["image", "label"]),
        ToTensord(keys=["image", "label"]),
    ]
)
device = torch.device("cuda:0")
model = UNet(
    dimensions=3,
    in_channels=1,
    out_channels=2,
    channels=(16, 32, 64, 128, 256),
    strides=(2, 2, 2, 2),
    num_res_units=2,
    norm=Norm.BATCH,
).to(device)
loss_function = DiceLoss(to_onehot_y=True, softmax=True)
optimizer = torch.optim.Adam(model.parameters(), 1e-4)
train_ds = CacheDataset(data=train_files, transform=train_transforms, cache_rate=1.0, num_workers=1)
train_loader = DataLoader(train_ds, batch_size=6, shuffle=True, num_workers=16)
val_ds = CacheDataset(data=val_files, transform=val_transforms, cache_rate=1.0, num_workers=16)
val_loader = DataLoader(val_ds, batch_size=1, num_workers=1)
epoch_num = 600
val_interval = 2
best_metric = -1
best_metric_epoch = -1
epoch_loss_values = list()
metric_values = list()
for epoch in range(epoch_num):
    print("-" * 10)
    print(f"epoch {epoch + 1}/{epoch_num}")
    model.train()
    epoch_loss = 0
    step = 0
    for batch_data in train_loader:
        step += 1
        inputs, labels = (
            batch_data["image"].to(device),
            batch_data["label"].to(device),
        )
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = loss_function(outputs, labels)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
        print(f"{step}/{len(train_ds) // train_loader.batch_size}, train_loss: {loss.item():.4f}")
    epoch_loss /= step
    epoch_loss_values.append(epoch_loss)
    print(f"epoch {epoch + 1} average loss: {epoch_loss:.4f}")

Expected behavior
The UNet should train and not break.

Environment (please complete the following information):
OS: Ubuntu 20.04LTS
MONAI version: 0.3.0
Python version: 3.8.2 (default, Mar 26 2020, 15:53:00) [GCC 7.3.0]
OS version: Linux (5.4.0-52-generic)
Numpy version: 1.18.1
Pytorch version: 1.5.0
MONAI flags: HAS_EXT = False, USE_COMPILED = False

Optional dependencies:
Pytorch Ignite version: 0.3.0
Nibabel version: 3.1.0
scikit-image version: 0.16.2
Pillow version: 7.1.2
Tensorboard version: 2.2.1
gdown version: 3.12.2
TorchVision version: 0.6.0a0+82fd1c8
ITK version: 5.1.1
tqdm version: 4.50.2

Additional context
I am trying to do tumor detection on whole-body MRI scans. The tumors are small and the body is large. So far this is giving me an average F1 score of 0.17 using this library, training with the 20+20+20+20+16 stacking workaround.

How to handle RGB 2D images

Describe the bug
I just tried running the 2D segmentation tutorial, but on my own 2D images (a mixed dataset of TIFF, PNG, JPG and BMP images). I ran into several problems: for example, LoadPNGd cannot handle TIFF images, and the rest of the transform pipeline throws an error (I think PIL loads TIFF images differently from other formats; I usually use skimage.io, which always returns a numpy array). The biggest problem, though, is that the transform pipeline cannot handle the color channel in RGB images, or I am doing something wrong when applying the Resized() transform. The latter is necessary because I need images at a fixed size of 320x240 at the end of the transform pipeline.

To Reproduce
Put a few RGB color images (maybe including at least one TIFF image ;) into a directory, then set up a simple transform pipeline like this:

train_transforms = Compose(
    [
        LoadImaged(keys=["img"]),
        LoadNumpyd(keys=["seg"]), # my segs are four channels stored as numpy array, of shape (height,width,4)
        ScaleIntensityd(keys="img"),
        Resized(keys=["img", "seg"], spatial_size=(240,320), mode='bilinear', align_corners=True),
        RandFlipd(keys=["img", "seg"], prob=0.5),
        ToTensord(keys=["img", "seg"]),
    ]
)

Then, to check the shape of the output tensors:

# define check dataset, check data loader
check_ds = monai.data.Dataset(data=train_files, transform=train_transforms)
# use batch_size=2 to load images and use RandCropByPosNegLabeld to generate 2 x 4 images for network training
check_loader = DataLoader(check_ds, batch_size=2, num_workers=4, collate_fn=list_data_collate)
check_data = monai.utils.misc.first(check_loader)
print(check_data["img"].shape, check_data["seg"].shape)
plt.imshow(np.squeeze(check_data["img"][0,0,:,:]))

Expected behavior
If the color channel is handled correctly, I expect the shape of the tensors to be [2,3,240,320].

Observed behavior
The output shape is [2,300,240,320] (please note that in my case, monai.utils.misc.first(check_loader) loads an image of shape [300,400,3]).
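
The reported shape [2,300,240,320] suggests that the first axis of the channel-last array (300 rows) was treated as the channel axis and the remaining (400, 3) axes were resized to (240, 320). Below is a minimal NumPy sketch of moving the colour axis to the front before any spatial transform. Where exactly this step belongs in the pipeline is an assumption, and the array is a dummy stand-in for a loaded image:

```python
import numpy as np

# Stand-in for one loaded RGB image, stored channel-last as (height, width, 3).
image = np.zeros((300, 400, 3), dtype=np.float32)

# Move the colour axis to the front so spatial transforms see (3, height, width).
channel_first = np.moveaxis(image, -1, 0)

print(image.shape, "->", channel_first.shape)
```

The same reordering would be needed for the four-channel segmentation arrays stored as (height, width, 4).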

Environment (please complete the following information):

Ubuntu 18.04
MONAI version: 0.2.0+166.g12b3fbf
Python version: 3.6.10 |Anaconda, Inc.| (default, May 8 2020, 02:54:21) [GCC 7.3.0]
Numpy version: 1.19.1
Pytorch version: 1.7.0a0+8deb4fe
Optional dependencies:
Pytorch Ignite version: 0.3.0
Nibabel version: 3.1.1
scikit-image version: 0.15.0
Pillow version: 7.2.0
Tensorboard version: 1.15.0+nv
gdown version: 3.12.2
TorchVision version: 0.8.0a0
ITK version: 5.1.1

missing colab button and install part in ThreadBuffer notebook

Describe the bug
Hi @ericspod, could you please help add the Colab button and the installation from the latest MONAI code (as we haven't released 0.4 yet) to the ThreadBuffer notebook?
Thanks.

ValueError while training the model for Brain Tumor Segmentation

Hi there -

I'm new to MONAI and doing some learning of the brain tumor segmentation code - referring to the file brats_segmentation_3d.ipynb under tutorials/3d_segmentation. I'm using this code AS-IS in my Jupyter Notebook. While training on the Medical Decathlon dataset, exactly after epoch 2 I see the following error:

epoch 2 average loss: 0.8960

ValueError Traceback (most recent call last)
in
54 # metric_sum += value.item() * not_nans
55 # compute mean dice for TC
---> 56 value_tc, not_nans = dice_metric(y_pred=val_outputs[:, 0:1], y=val_labels[:, 0:1])
57 not_nans = not_nans.item()
58 metric_count_tc += not_nans

ValueError: not enough values to unpack (expected 2, got 1)

Can you please suggest anything to rectify this problem?

Many thanks,
Sekhar H.
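
The message means the call returned a single value where the notebook expects a (value, not_nans) pair, which can happen when the notebook and the installed library version disagree on the return type. Here is a defensive sketch in plain Python. `split_metric_result` is a hypothetical helper, not part of MONAI, and the default count of 1 is an assumption:

```python
def split_metric_result(result, default_count=1):
    """Return (value, count) whether `result` is a pair or a bare value."""
    if isinstance(result, tuple) and len(result) == 2:
        return result
    # Bare value: assume it counts as `default_count` valid items.
    return result, default_count


value, count = split_metric_result((0.83, 4))
print(value, count)

value, count = split_metric_result(0.83)
print(value, count)
```

Installing the MONAI version that the notebook was written against is the more direct fix; the helper only shows what the two return shapes look like.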

ITK version: NOT INSTALLED or UNKNOWN VERSION

Dear all,

After pip installing ITK or SimpleITK the "print_config()" prompt does not find the installed ITK version.
Moreover, while executing the "densenet_training.array.ipynb" tutorial I get this error:

Load and cache transformed data: 0%| | 0/9 [00:00<?, ?it/s]

OptionalImportError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/monai/transforms/utils.py in apply_transform(transform, data, map_items)
308 return [transform(item) for item in data]
--> 309 return transform(data)
310 except Exception as e:

35 frames
OptionalImportError: import itk (No module named 'itk').

For details about installing the optional dependencies, please visit:
https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies

During handling of the above exception, another exception occurred:

OptionalImportError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/monai/utils/module.py in optional_import(module, version, version_checker, name, descriptor, version_args, allow_namespace_pkg)
165 actual_cmd = f"import {module}"
166 try:
--> 167 pkg = __import__(module) # top level module
168 the_module = import_module(module)
169 if not allow_namespace_pkg:

OptionalImportError: Applying transform <monai.transforms.io.dictionary.LoadImaged object at 0x7f7d33d66860>.

Regards,
Sebastian
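
The traceback reports `No module named 'itk'`, so the loader is looking for a module importable under the name `itk`; SimpleITK is imported under a different name. Below is a small standard-library check of which import names are visible to the running interpreter. The list of names probed is an assumption chosen for illustration:

```python
import importlib.util


def find_modules(names):
    """Map each import name to True if the interpreter can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = find_modules(["itk", "SimpleITK", "nibabel", "json"])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```

If `itk` still shows as missing inside the notebook after installation, the package was probably installed into a different environment from the one the kernel runs in, or the kernel needs a restart.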

[Question] augmentation tutorial, how to deal with labels?

I am trying to implement torchio and batchgenerators augmentations following this tutorial:
https://github.com/Project-MONAI/Tutorials/blob/master/integrate_3rd_party_transforms.ipynb

The spatial transformations should also affect my label maps, however I don't want to use linear or bspline interpolation which makes sense for image data for my label maps. What is the best way to implement that?
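
The concern can be seen on a one-dimensional toy example: linear interpolation invents in-between values that are not valid class labels, while nearest-neighbour interpolation only ever copies existing labels. This is a pure-Python sketch, not tied to torchio, batchgenerators, or MONAI:

```python
def resample(values, new_len, mode):
    """Resample a 1D sequence to new_len (>= 2) samples, 'nearest' or 'linear'."""
    old_len = len(values)
    out = []
    for i in range(new_len):
        # Position of output sample i on the input index axis.
        pos = i * (old_len - 1) / (new_len - 1)
        lo = int(pos)
        hi = min(lo + 1, old_len - 1)
        frac = pos - lo
        if mode == "nearest":
            out.append(values[lo] if frac < 0.5 else values[hi])
        else:
            out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


labels = [0, 0, 2, 2]
print(resample(labels, 7, "nearest"))
print(resample(labels, 7, "linear"))
```

In the linear result a value of 1 appears between the 0 and 2 regions even though no voxel carried that label. Libraries that resample images typically let the interpolation mode be chosen per input, so a common approach is nearest-neighbour for the label map and linear or spline for the image.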

3D Classifier

Dear all,

I'm adapting the 3D classifier tutorial "densenet_training" to my example files.
My nifti files have a different size, so I get this error when doing the input to the model:

Expected 5-dimensional input for 5-dimensional weight [64, 1, 7, 7, 7], but got 4-dimensional input of size [2, 224, 224, 160] instead

How can I modify the code so I can test the tutorial on my files?

Thanks

fixing the conda env

Is your feature request related to a problem? Please describe.
would be great to fix the Anaconda Python distribution with a predefined yml file, such as
https://github.com/Project-MONAI/MONAIBootcamp2020#instal-local-environment

automate the testing of the jupyter notebooks

this ticket looks for an automated CI setup to ensure the quality of the notebooks.
see also discussions:

The Code used for the training stops

!python run_net.py train --data_folder "COVID-19-20_v2/Train" --model_folder "runs"

The code cell stops after showing this result. I don't know what to do next. How can I find the models?

MONAI version: 0.3.0+87.ge94e243
Python version: 3.6.9 (default, Oct 8 2020, 12:12:24) [GCC 8.4.0]
OS version: Linux (4.19.112+)
Numpy version: 1.18.5
Pytorch version: 1.7.0+cu101
MONAI flags: HAS_EXT = False, USE_COMPILED = False

Optional dependencies:
Pytorch Ignite version: 0.4.2
Nibabel version: 3.0.2
scikit-image version: 0.16.2
Pillow version: 7.0.0
Tensorboard version: 2.3.0
gdown version: 3.6.4
TorchVision version: 0.8.1+cu101
ITK version: NOT INSTALLED or UNKNOWN VERSION.
tqdm version: 4.53.0

For details about installing the optional dependencies, please visit:
https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies

INFO:root:training: image/label (199) folder: COVID-19-20_v2/Train
INFO:root:training: train 160 val 39, folder: COVID-19-20_v2/Train
INFO:root:batch size 2
Load and cache transformed data: 100% 160/160 [05:30<00:00, 2.06s/it]
Load and cache transformed data: 100% 39/39 [01:24<00:00, 2.18s/it]
BasicUNet features: (32, 32, 64, 128, 256, 32).
^C

Explicit for-loop optimisation or SupervisedTrainer

It seems that in the majority of tutorials, the optimisation for loop is given explicitly. In relatively few places, the SupervisedTrainer is used, despite existing for this reason.

I can see why having the explicit for loop is beneficial for tutorials - so that people are more aware of the inner workings. However, for the sake of conciseness, I would be in favour of having just one notebook (named suitably) in which the explicit for loop is given, and then from there on, using the SupervisedTrainer. Notebooks using SupervisedTrainer could then refer to the explicit notebook.

I think @ericspod is in favour of leaving the notebooks as they are, so as not to hide anything (which I understand). Anyone else have an opinion?

Log and plotting result for training process

Thank you for the tutorials!

I can't find any log files saved in ./run and it seems that this part is not included in the code. (./3d_segmentation/baseline)
It would be much clearer if the training information is saved and plotted.

Another question: do the images under the 'Validation' folder have labels (ground truth), and if so, where are they?

Thank you!

Adopt PEP8 in MONAI/examples/notebooks/

Is your feature request related to a problem? Please describe.
Some notebooks are not following the PEP8 style guide.

Describe the solution you'd like
Please, consider following the PEP8 style guide in the notebooks from MONAI/examples/notebooks/.
For example, in examples/notebooks/mednist_tutorial.ipynb, cell 4 has variables named using the CamelCase style instead of snake_case (https://www.python.org/dev/peps/pep-0008/#id45), for example:

dataDir = './MedNIST/'
classNames = os.listdir(dataDir)
numClass = len(classNames)

Later, in the same notebook, the snake_case is adopted.

train_ds = MedNISTDataset(trainX, trainY, train_transforms)
train_loader = DataLoader(train_ds, batch_size=300, shuffle=True, num_workers=10)

val_ds = MedNISTDataset(valX, valY, val_transforms)
val_loader = DataLoader(val_ds, batch_size=300, num_workers=10)
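
For illustration, the three CamelCase assignments from cell 4 quoted above might be renamed as follows. This is only a naming sketch: the temporary folder and class-folder names are invented so the snippet can run without the MedNIST data, and `sorted()` is added here only to make the order reproducible.

```python
import os
import tempfile

# Invented stand-in for the dataset folder, so the snippet runs anywhere.
data_dir = tempfile.mkdtemp()
for folder in ("ClassA", "ClassB"):
    os.mkdir(os.path.join(data_dir, folder))

# snake_case equivalents of dataDir, classNames and numClass.
class_names = sorted(os.listdir(data_dir))
num_class = len(class_names)

print(class_names, num_class)
```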

covid challenge evaluation

Hello, I am a participant in the COVID challenge. Submissions are now closed. I have a new prediction and would like to know its Dice score for my own research. Could you please reopen the evaluation website for me or share the evaluation method? It would be even better if the organizers could release the ground truth labels of the test and validation datasets. Thank you so much!

Develop a tutorial about how to develop networks based on MONAI APIs

Is your feature request related to a problem? Please describe.
We have very rich network layers, blocks, etc. and support both 2D and 3D, we also have layer factory to generate common layers. But currently, we don't have a step by step tutorial to show how to use the APIs to develop networks.

Colab links point to main repository

Tutorials have a link for opening them with Colab:

But this points to their location prior to being moved into a separate repository:

review notebooks with reviewnb

Install ReviewNB (https://www.reviewnb.com/) on this repo for diffing and commenting on pull requests that change Jupyter notebooks.

AttributeError: module 'monai.networks.nets' has no attribute 'BasicUNet'

Describe the bug
The code from "A U-Net model for lung lesion segmentation from CT images" could not be run.

The error information is: TypeError: __init__() got an unexpected keyword argument 'to_onehot_y'.

To Reproduce
Steps to reproduce the behavior:

Go to https://github.com/Project-MONAI/tutorials/tree/master/3d_segmentation/challenge_baseline/
Install MONAI by pip install monai. NOTE: this is important. Different install methods lead to different errors #60
Run commands python run_net.py train --data_folder "COVID-19-20_v2/Train" --model_folder "runs"

Expected behavior
Start training of the model.

Screenshots

Environment (please complete the following information):

OS: CentOS
Python version: 3.6
MONAI version [e.g. git commit hash]: 0.3.0
CUDA/cuDNN version:
GPU models and configuration

Additional context
https://covid-segmentation.grand-challenge.org/Resource/

Quality assurance of the MONAI/examples folder

Is your feature request related to a problem? Please describe.
As the size and content scope of the MONAI/examples folder increase,
it's necessary to figure out the hardware/software requirements for running the examples,
and also provide some forms of quality assurance of the example codes.

Describe the solution you'd like
could automatically run the examples as a part of the automated CI/CD pipeline?

Describe alternatives you've considered
manually verifying all the examples regularly (tedious and error-prone)

Additional context
see also https://github.com/Project-MONAI/MONAI/issues/296

ROI size sliding window

Hi,
I was wondering how UNet deals with the sliding window input.
Because the ROI you set is bigger than the patches UNet is trained on.
How does this work?

Thanks.
Kirsten
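
One way to picture sliding-window inference: the network only ever sees one window of the chosen ROI size at a time, and the windows are placed so that together they cover the whole volume. Below is a sketch of how window start positions along a single axis could be computed, assuming a fixed overlap fraction and a final window clamped to the volume edge. It illustrates the idea and is not MONAI's implementation:

```python
def window_starts(length, roi, overlap=0.25):
    """Start indices of windows of size `roi` covering an axis of `length`."""
    if roi >= length:
        return [0]  # a single window covers the whole axis
    step = max(1, int(roi * (1 - overlap)))
    starts = list(range(0, length - roi, step))
    starts.append(length - roi)  # last window ends exactly at the edge
    return starts


starts = window_starts(length=226, roi=96)
print(starts)
print(all(s + 96 <= 226 for s in starts))
```

A fully convolutional network has no fixed input size baked into its weights, so the inference window may differ from the training patch size as long as it is compatible with the network's strides.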

TypeError: __init__() got an unexpected keyword argument 'to_onehot_y'

Describe the bug
The code from "A U-Net model for lung lesion segmentation from CT images" could not be run.

The error information is: TypeError: __init__() got an unexpected keyword argument 'to_onehot_y'.

To Reproduce
Steps to reproduce the behavior:

Go to https://github.com/Project-MONAI/tutorials/tree/master/3d_segmentation/challenge_baseline/
Install MONAI by pip install "git+https://github.com/Project-MONAI/MONAI#egg=monai[nibabel,ignite,tqdm]"
Run commands python run_net.py train --data_folder "COVID-19-20_v2/Train" --model_folder "runs"

Expected behavior
Start training of the model.

Screenshots

Environment (please complete the following information):

OS: CentOS
Python version: 3.6
MONAI version [e.g. git commit hash]: 0.3.0
CUDA/cuDNN version:
GPU models and configuration

Additional context
https://covid-segmentation.grand-challenge.org/Resource/

a tutorial of GradCAM

The GradCAM module is in place; it would be great to have a 3D classification model demo on a medical-image task.

Project-MONAI/MONAI#1303

LR SCHEDULER and LR Finder

It would be nice to have examples of how to use learning rate schedulers with the MONAI classes. It would also be nice to have an LR finder like the fastai one-cycle one.

Tutorial for the image readers and the LoadImage transform

As the experimental new APIs have been implemented for MONAI I/O Project-MONAI/MONAI#909
this ticket looks for a tutorial to show that:

the image reader APIs could be used independently as file format specific loaders
LoadImage transform could be used as a format-agnostic module, typically as the first transform in a 'transform chain'

(might be useful to briefly mention the optional import feature of MONAI)

mkdtemp missing brackets

In a few places (e.g., 3rd cell here), we have:

root_dir = tempfile.mkdtemp if directory is None else directory

which is missing the brackets:

root_dir = tempfile.mkdtemp() if directory is None else directory

Might be worth grepping and replacing all mkdtemp[space] with mkdtemp()[space].
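
The difference is easy to check in isolation. This short standard-library sketch uses `directory = None` to stand in for the unset environment variable and shows what each form binds to the name:

```python
import os
import tempfile

directory = None

# Without the brackets the name is bound to the function itself.
wrong = tempfile.mkdtemp if directory is None else directory
print(callable(wrong), isinstance(wrong, str))

# With the brackets a new temporary directory is created and its path returned.
root_dir = tempfile.mkdtemp() if directory is None else directory
print(isinstance(root_dir, str), os.path.isdir(root_dir))

os.rmdir(root_dir)  # clean up the empty directory made for this demo
```

With the function object in `root_dir`, later calls such as os.path.join(root_dir, ...) would fail with a TypeError rather than at the assignment itself.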

rename the modules/workflows folder to modules/engines

the folder mainly demonstrates MONAI's engines and handlers implementation
cc @pfjaeger @zephyrie @Nic-Ma

Soft labels to reflect uncertainty on boundary of ground truth annotation

Hey all,

I am currently using MONAI to participate in the grand challenge for COVID segmentation. As a baseline model I use the DynUnet with parameters adapted from nnU-Net. This works great and gave me a validation Dice of around 0.7. To further improve the results, I wanted to focus on handling the noisy ground truth annotations. Since the ground truth annotations from this project are not really clean, I want to implement some form of 'soft labels'. By Gaussian smoothing the masks, the probability drops below 1 on the borders of the lesions, reflecting the uncertainty of the ground truth annotation.

I tried implementing this with MONAI building blocks, but I got stuck while using the Dice loss, since the one_hot function that is called in there expects binary mask input and doesn't work as expected for probabilistic masks. I have now written my own 'soft_label_dice' that handles probabilistic labels in the case of only 2 class labels. I thought this might be an interesting feature for MONAI, since many segmentation problems have uncertain ground truth boundaries.

I was wondering what you guys think of this soft labeling strategy. I know other methods exist for increasing noise robustness, but it seemed my model was being punished too hard for making mistakes during training on regions that are only coarsely annotated.
Below I added a snippet with my soft_label_dice function.

Kind regards,
Joris Wuts

import torch

def soft_label_dice(preds, label):
    # preds: raw network output with 2 channels, shape (B, 2, H, [W, D])
    preds = torch.softmax(preds, 1)
    # label is of shape (B, 1, H, [W, D]) with float values ranging from 0 to 1
    label = torch.cat((label, 1 - label), 1)

    # sum over the spatial dimensions only
    reduce_axis = list(range(2, len(preds.shape)))
    nom = torch.sum(torch.pow(preds - label, 2), dim=reduce_axis)

    pred_o = torch.sum(preds, dim=reduce_axis)
    ground_o = torch.sum(label, dim=reduce_axis)

    denominator = ground_o + pred_o + 0.00001
    f: torch.Tensor = nom / denominator
    f = torch.mean(f)
    return f

show the dicom loading usage

would be great to extend the https://github.com/Project-MONAI/Tutorials/blob/master/load_medical_images.ipynb to load DICOM

Do data transforms happen on every yield from the train loader or once at load time?

QUESTION 1:

When I apply a list of transforms as in the Spleen tutorial notebook, do they happen once here:

train_ds = CacheDataset(data=train_files, transform=train_trans, cache_rate=1.0, num_workers=8)

Note that it says

Load and cache transformed data: 100%|██████████| 41/41 [00:15<00:00, 2.65it/s]

The past tense "transformed" seems to indicate that transformations only happen once. Or, after defining

train_loader = DataLoader(train_ds, batch_size=2, shuffle=True, num_workers=loader_workers)

do the transformations actually happen on every reference to an item in the training queue, specifically here:

for batch_data in train_loader:

This is the ideal case for me. In the former case, should I repeat my data 100 times before running it through CacheDataset to get my augmentations? Is that standard? It seems it would be a lot better to do the transformations on the fly. Also very necessary for a subsampling transformation like RandCropByPosNegLabeld.

This could be a dumb question, I just don't see it spelled out in the docs and the logging printed out by CacheDataset.

NOTE: I'm guessing this happens with every train_loader yield, because my training loop has slowed way down. This leads to

QUESTION 2: Would it be possible to do these transforms in the GPU? I'm assuming the slowdown happens because they are on CPU, as shown by the attached picture, which depicts a very lightly loaded GPU and 1 hammered CPU core. This leads to

QUESTION 3: Can I speed up the train loader transformations by adding workers? I'm guessing Yes. If not, should be Yes. I'll try it now.

ANSWER 1&3: Yes it must be happening for each train_loader yield, yes adding workers helps. NOTE: A comment in the tutorial notebook says "because this is cached in memory, you only need one work". This is misleading. And on Question 2: The 8 cores I added are 100% active. The GPU is 10% to 25% loaded max. These transforms should happen on the GPU!! Most of the compute time is spent in the transforms. Very little in the training.
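
As far as I understand the CacheDataset documentation, the deterministic transforms at the start of the chain are run once and cached, while everything from the first random transform onward runs again on every access. The toy class below sketches that split in plain Python. `ToyCacheDataset`, `load` and `jitter` are made-up names for illustration and are not MONAI code:

```python
import random


class ToyCacheDataset:
    """Caches the deterministic transforms once; reruns random ones per access."""

    def __init__(self, data, deterministic, randomized):
        self.randomized = randomized
        # Done once, up front: this is the "load and cache" phase.
        self.cache = [self._apply(deterministic, item) for item in data]

    @staticmethod
    def _apply(transforms, item):
        for transform in transforms:
            item = transform(item)
        return item

    def __len__(self):
        return len(self.cache)

    def __getitem__(self, index):
        # Done on every access, so each epoch sees a fresh augmentation.
        return self._apply(self.randomized, self.cache[index])


calls = {"load": 0}


def load(x):
    """Stand-in for an expensive deterministic step such as reading a file."""
    calls["load"] += 1
    return x * 10


def jitter(x):
    """Stand-in for a random augmentation."""
    return x + random.random()


ds = ToyCacheDataset([1, 2, 3], deterministic=[load], randomized=[jitter])
first, second = ds[0], ds[0]
print(calls["load"], first, second)
```

The expensive `load` step runs three times in total, however often items are fetched, while `jitter` produces a new value on each access. That matches the observation above that the per-batch cost sits in the random transforms and that extra DataLoader workers help.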

Spleen example crashes if I modify it to use different input dataset

Describe the bug
The example crashes. I tried different roi_sizes, but setting e.g. (-1, -1, -1) just postpones the crash for later in the process.

To Reproduce
Run my notebook which is a modified copy of the spleen example.

Expected behavior
Training finishes after a while.

Optional dependencies:
Pytorch Ignite version: 0.3.0
Nibabel version: 3.1.1
scikit-image version: 0.17.2
Pillow version: 7.2.0
Tensorboard version: 2.3.0

Additional context

----------
epoch 1/10
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-12-8f78531d9ef2> in <module>
     19         )
     20         optimizer.zero_grad()
---> 21         outputs = model(inputs)
     22         loss = loss_function(outputs, labels)
     23         loss.backward()

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

c:\dev\monai\pyenv\lib\site-packages\monai\networks\nets\unet.py in forward(self, x)
    125 
    126     def forward(self, x):
--> 127         x = self.model(x)
    128         return x
    129 

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\container.py in forward(self, input)
     98     def forward(self, input):
     99         for module in self:
--> 100             input = module(input)
    101         return input
    102 

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

c:\dev\monai\pyenv\lib\site-packages\monai\networks\layers\simplelayers.py in forward(self, x)
     31 
     32     def forward(self, x):
---> 33         return torch.cat([x, self.submodule(x)], self.cat_dim)
     34 
     35 

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\container.py in forward(self, input)
     98     def forward(self, input):
     99         for module in self:
--> 100             input = module(input)
    101         return input
    102 

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

c:\dev\monai\pyenv\lib\site-packages\monai\networks\layers\simplelayers.py in forward(self, x)
     31 
     32     def forward(self, x):
---> 33         return torch.cat([x, self.submodule(x)], self.cat_dim)
     34 
     35 

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\container.py in forward(self, input)
     98     def forward(self, input):
     99         for module in self:
--> 100             input = module(input)
    101         return input
    102 

c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

c:\dev\monai\pyenv\lib\site-packages\monai\networks\layers\simplelayers.py in forward(self, x)
     31 
     32     def forward(self, x):
---> 33         return torch.cat([x, self.submodule(x)], self.cat_dim)
     34 
     35 

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 7 and 8 in dimension 3 at C:\w\1\s\windows\pytorch\aten\src\TH/generic/THTensor.cpp:612
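
A possible reading of this error, sketched under assumptions: the failing torch.cat is inside a skip connection, which joins a tensor with the result of downsampling and then upsampling it. If a spatial size becomes odd on the way down, the upsampled tensor comes back one voxel larger (7 vs. 8 here). The helper names below are hypothetical. The model of the network (four stride-2 levels, each mapping n to ceil(n / 2), with upsampling doubling the size) is an assumption based on the traceback, not taken from the reporter's notebook.

```python
import math


def skip_mismatches(size, num_strides=4):
    """List (level, skip_size, upsampled_size) for every skip connection
    whose two inputs would differ along one spatial axis.

    Assumed model of the network: every stride-2 layer maps n -> ceil(n / 2),
    every upsampling layer maps n -> 2 * n, and the innermost block keeps
    the size unchanged.
    """
    mismatches = []
    n = math.ceil(size / 2)  # size after the first downsampling layer
    for level in range(1, num_strides):
        upsampled = 2 * math.ceil(n / 2)  # size coming back up from below
        if upsampled != n:
            mismatches.append((level, n, upsampled))
        n = math.ceil(n / 2)  # descend one level
    return mismatches


def pad_to_multiple(size, factor=16):
    """Smallest size >= `size` that is divisible by `factor`."""
    return math.ceil(size / factor) * factor


print(skip_mismatches(56))   # an axis of length 56 breaks at the third level
print(skip_mismatches(96))   # a multiple of 16 passes every level
print(pad_to_multiple(56))   # nearest safe size for that axis
```

Under these assumptions, cropping or padding every spatial axis to a multiple of 16 (2 to the power of the number of stride-2 levels) would avoid the mismatch. Whether that applies to the modified notebook depends on its actual UNet strides and image sizes.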

TensorboardImageHandler for challenge_baseline

Can anybody please tell me how to use the TensorBoardImageHandler with the challenge_baseline script?
Which output_transform should I use?
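
One possible shape for these callables, as a sketch only: MONAI's dictionary-based workflow examples pass the handler a batch_transform that picks the image and label from the input batch and an output_transform that picks the prediction from the engine output. The dictionary keys "image", "label" and "pred" are assumptions about what the challenge_baseline trainer stores. The stand-in values below are plain strings rather than tensors so the example runs without MONAI.

```python
def batch_transform(batch):
    """Pick the (image, label) pair to plot from the input batch."""
    return batch["image"], batch["label"]


def output_transform(output):
    """Pick the prediction to plot from the engine's output."""
    return output["pred"]


# Stand-in data; in a real run the values would be tensors.
fake_batch = {"image": "img-tensor", "label": "lbl-tensor"}
fake_output = {"image": "img-tensor", "label": "lbl-tensor",
               "pred": "pred-tensor", "loss": 0.5}

print(batch_transform(fake_batch))
print(output_transform(fake_output))
```

These two functions would then be passed as the batch_transform and output_transform keyword arguments of the handler. The exact signature should be checked against the documentation of the installed MONAI version.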

Develop FL example based on Clara FL

Is your feature request related to a problem? Please describe.
(Originally from Project-MONAI/MONAI#498.) We can use MONAI to build many FL examples based on different FL architectures. This issue tracks the development of an example based on NVIDIA Clara FL.

autoencoder_mednist transformations failure

Describe the bug
When running the current version of the autoencoder_mednist tutorial, it crashes while applying the transforms to the data, specifically while creating the CacheDataset.

To Reproduce
Run all the cells until you reach the one that creates the CacheDataset; that is where it crashes.

Expected behavior
It should perform the transformations

Additional context

A simple solution I found is to add the "reader" parameter to the LoadImageD transform. For the MedNIST Hand dataset (the default in this tutorial), it should be reader="PILReader", as all the images are .jpg files.
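
To illustrate the workaround, here is a minimal sketch of choosing a reader name from the file extension. The helper pick_reader and the mapping are hypothetical. The reader names are MONAI image-reader classes, but whether a string is accepted for "reader" depends on the installed MONAI version.

```python
from pathlib import Path

# Reader names are MONAI image-reader classes; the mapping is illustrative.
READER_BY_SUFFIX = {
    ".jpg": "PILReader",
    ".jpeg": "PILReader",
    ".png": "PILReader",
    ".nii": "NibabelReader",
    ".gz": "NibabelReader",  # e.g. "scan.nii.gz"
    ".npy": "NumpyReader",
}


def pick_reader(filename):
    """Return the reader name for `filename` (hypothetical helper)."""
    suffix = Path(filename).suffix.lower()
    if suffix not in READER_BY_SUFFIX:
        raise ValueError(f"no reader configured for {suffix!r} files")
    return READER_BY_SUFFIX[suffix]


print(pick_reader("Hand/000001.jpg"))
```

The returned name would be the value given to the "reader" parameter of LoadImageD, as described in the report above.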

Fail to load state dict

Hi,

I load the state dict into the same model, generated by monai.networks.nets.UNet, but it reports the error below. I have loaded this state dict successfully before, so I am not sure what is wrong this time. The weight keys seem to differ between 'act' and 'adn.A'.

Thank you.

RuntimeError: Error(s) in loading state_dict for UNet:
        Missing key(s) in state_dict: "model.0.conv.unit0.act.weight", "model.0.conv.unit1.act.weight", "model.1.submodule.0.conv.unit0.act.weight", "model.1.submodule.0.conv.unit1.act.weight", "model.1.submodule.1.submodule.0.conv.unit0.act.weight", "model.1.submodule.1.submodule.0.conv.unit1.act.weight", "model.1.submodule.1.submodule.1.submodule.0.conv.unit0.act.weight", "model.1.submodule.1.submodule.1.submodule.0.conv.unit1.act.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.0.conv.unit0.act.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.0.conv.unit1.act.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.1.submodule.conv.unit0.act.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.1.submodule.conv.unit1.act.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.2.0.act.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.2.1.conv.unit0.act.weight", "model.1.submodule.1.submodule.1.submodule.2.0.act.weight", "model.1.submodule.1.submodule.1.submodule.2.1.conv.unit0.act.weight", "model.1.submodule.1.submodule.2.0.act.weight", "model.1.submodule.1.submodule.2.1.conv.unit0.act.weight", "model.1.submodule.2.0.act.weight", "model.1.submodule.2.1.conv.unit0.act.weight", "model.2.0.act.weight". 
        Unexpected key(s) in state_dict: "model.0.conv.unit0.adn.A.weight", "model.0.conv.unit1.adn.A.weight", "model.1.submodule.0.conv.unit0.adn.A.weight", "model.1.submodule.0.conv.unit1.adn.A.weight", "model.1.submodule.1.submodule.0.conv.unit0.adn.A.weight", "model.1.submodule.1.submodule.0.conv.unit1.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.0.conv.unit0.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.0.conv.unit1.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.0.conv.unit0.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.0.conv.unit1.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.1.submodule.conv.unit0.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.1.submodule.conv.unit1.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.2.0.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.2.1.conv.unit0.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.2.0.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.2.1.conv.unit0.adn.A.weight", "model.1.submodule.1.submodule.2.0.adn.A.weight", "model.1.submodule.1.submodule.2.1.conv.unit0.adn.A.weight", "model.1.submodule.2.0.adn.A.weight", "model.1.submodule.2.1.conv.unit0.adn.A.weight", "model.2.0.adn.A.weight".
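
The missing and unexpected keys listed above differ only in "act" versus "adn.A", which suggests (an assumption, not confirmed in this report) that the checkpoint and the model were created with different MONAI versions. Using the same version for saving and loading is the safer fix. If that is not possible, a minimal sketch of renaming the keys could look like this. The helper name is hypothetical and the values are stand-ins for tensors.

```python
def remap_activation_keys(state_dict, old=".adn.A.", new=".act."):
    """Return a copy of `state_dict` with the activation keys renamed."""
    return {key.replace(old, new): value for key, value in state_dict.items()}


# Stand-in checkpoint; real values would be tensors.
checkpoint = {
    "model.0.conv.unit0.conv.weight": "w0",
    "model.0.conv.unit0.adn.A.weight": "a0",
    "model.2.0.adn.A.weight": "a1",
}

remapped = remap_activation_keys(checkpoint)
for key in sorted(remapped):
    print(key)
```

The remapped dictionary would then be passed to the model's load_state_dict. This only helps if the layer shapes are otherwise identical between the two versions.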

Sliding Window Inference giving error on 0.4.0

Command used for sliding window inference (it fails on MONAI 0.4.0 but works fine on 0.3.0):

Code Snippet:

for val_data in self.val_loader:
    val_step_start_time = time.time()
    val_images, val_labels = val_data["image"].to(self.device), val_data["label"].to(self.device)
    roi_size = (128, 128, 128)
    sw_batch_size = 6
    if amp:
        with torch.cuda.amp.autocast():
            val_outputs = sliding_window_inference(val_images, roi_size, sw_batch_size, self.network, sw_device=self.device, device=self.device)
    else:
        val_outputs = sliding_window_inference(val_images, roi_size, sw_batch_size, self.network)

I get the same error with and without amp.

Error:

AttributeError Traceback (most recent call last)
/data/archit/Liver/Experiments/monai/main.py in
25
26 if __name__ == "__main__":
---> 27 main()

/data/archit/Liver/Experiments/monai/main.py in main()
19 if args.continue_training == True:
20 trainer.load_best_checkpoint()
---> 21 trainer.trainProcess(amp=True)
22
23

/data/archit/Liver/Experiments/monai/TeraReconAI/train/segmentationTrainer.py in trainProcess(self, amp)
71 self.initialize_network()
72 amp_start = time.time()
---> 73 super()._trainProcess(amp)
74 amp_total_time = time.time() - amp_start
75 print(f"Total training time with AMP: {amp_total_time:.4f}")

/data/archit/Liver/Experiments/monai/TeraReconAI/train/trainer.py in _trainProcess(self, amp)
168 # else:
169 self.network = self.network.to(self.device)
--> 170 val_outputs = sliding_window_inference(val_images, roi_size, sw_batch_size, self.network, sw_device=self.device, device=self.device)
171
172

/data/archit/Software/anaconda3/envs/monai/lib/python3.8/site-packages/monai/inferers/utils.py in sliding_window_inference(inputs, roi_size, sw_batch_size, predictor, overlap, mode, sigma_scale, padding_mode, cval, sw_device, device, *args, **kwargs)
127 ]
128 window_data = torch.cat([inputs[win_slice] for win_slice in unravel_slice]).to(sw_device)
--> 129 seg_prob = predictor(window_data, *args, **kwargs).to(device) # batched patch segmentation
130
131 if not _initialized: # init. buffer at the first iteration

AttributeError: 'list' object has no attribute 'to'
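
The traceback shows that sliding_window_inference calls .to(device) directly on whatever the predictor returns, so a network that returns a list of outputs fails at that point. A minimal workaround sketch, assuming the first element of the list is the main prediction (this is an assumption about the reporter's network), wraps the predictor. The names first_output and fake_network are hypothetical.

```python
def first_output(predictor):
    """Wrap `predictor` so list/tuple results are reduced to their first element."""
    def wrapped(*args, **kwargs):
        result = predictor(*args, **kwargs)
        if isinstance(result, (list, tuple)):
            return result[0]
        return result
    return wrapped


# Stand-in for a network that returns several outputs.
def fake_network(x):
    return [x * 2, x * 3]


print(first_output(fake_network)(5))
```

In the snippet from this report, first_output(self.network) would be passed in place of self.network as the predictor argument.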

U-Net model for lung lesion segmentation model does not run using colab

I tried running the command python run_net.py train --data_folder "COVID-19-20_v2/Train" --model_folder "runs" per the instructions and got the output shown in example A below. The model does not seem to be training. I also tried running the inference command python run_net.py infer --data_folder "COVID-19-20_v2/Validation" --model_folder "runs" and got the error shown in example B.

When I check the runs folder, I do not see any indication that the model ran or that checkpoints were saved.

I am using Google Colab to train the model.

example A

MONAI version: 0.3.0+57.g70650b8
Python version: 3.6.9 (default, Oct  8 2020, 12:12:24)  [GCC 8.4.0]
OS version: Linux (4.19.112+)
Numpy version: 1.18.5
Pytorch version: 1.7.0+cu101
MONAI flags: HAS_EXT = False, USE_COMPILED = False

Optional dependencies:
Pytorch Ignite version: 0.4.2
Nibabel version: 3.0.2
scikit-image version: 0.16.2
Pillow version: 7.0.0
Tensorboard version: 2.3.0
gdown version: 3.6.4
TorchVision version: 0.8.1+cu101
ITK version: NOT INSTALLED or UNKNOWN VERSION.
tqdm version: 4.51.0

For details about installing the optional dependencies, please visit:
    https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies

INFO:root:training: image/label (199) folder: COVID-19-20_v2/Train
INFO:root:training: train 160 val 39, folder: COVID-19-20_v2/Train
INFO:root:batch size 2
Load and cache transformed data: 100% 160/160 [05:18<00:00,  1.99s/it]
Load and cache transformed data: 100% 39/39 [01:21<00:00,  2.10s/it]
BasicUNet features: (32, 32, 64, 128, 256, 32).
^C

example B

MONAI version: 0.3.0+57.g70650b8
Python version: 3.6.9 (default, Oct  8 2020, 12:12:24)  [GCC 8.4.0]
OS version: Linux (4.19.112+)
Numpy version: 1.18.5
Pytorch version: 1.7.0+cu101
MONAI flags: HAS_EXT = False, USE_COMPILED = False

Optional dependencies:
Pytorch Ignite version: 0.4.2
Nibabel version: 3.0.2
scikit-image version: 0.16.2
Pillow version: 7.0.0
Tensorboard version: 2.3.0
gdown version: 3.6.4
TorchVision version: 0.8.1+cu101
ITK version: NOT INSTALLED or UNKNOWN VERSION.
tqdm version: 4.51.0

For details about installing the optional dependencies, please visit:
    https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies

Traceback (most recent call last):
  File "run_net.py", line 264, in <module>
    infer(data_folder=data_folder, model_folder=args.model_folder)
  File "run_net.py", line 179, in infer
    ckpt = ckpts[-1]
IndexError: list index out of range
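
The IndexError on ckpts[-1] is consistent with an empty checkpoint list, which would be expected if training was interrupted (the ^C in example A) before anything was saved. A minimal sketch of a clearer guard follows. The helper name and the "*.pt" file pattern are assumptions, not taken from run_net.py.

```python
import glob
import os


def latest_checkpoint(model_folder, pattern="*.pt"):
    """Return the last checkpoint path in sorted order, or fail with a clear message."""
    ckpts = sorted(glob.glob(os.path.join(model_folder, pattern)))
    if not ckpts:
        raise FileNotFoundError(
            f"no checkpoint matching {pattern!r} in {model_folder!r}; "
            "run training until a checkpoint is saved before inference"
        )
    return ckpts[-1]
```

With a guard like this, running inference before any checkpoint exists reports the missing file instead of an index error.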
