project-monai / tutorials
MONAI Tutorials
Home Page: https://monai.io/started.html
License: Apache License 2.0
What does this line mean?
I wanted to reproduce the baseline lesion network.
Every time I start training, this line appears. Is there something wrong?
Since there have been breaking changes since v0.2, we need to rerun and double-check all the examples and tutorials for v0.3.
Hi,
I am working on a segmentation task with only one target structure. I am trying to change the loss function to Dice + CE loss, so I changed the code as shown here:
class CrossEntropyLoss(nn.Module):
    def __init__(self):
        super().__init__()
        self.loss = nn.CrossEntropyLoss()

    def forward(self, y_pred, y_true):
        # CrossEntropyLoss target needs to have shape (B, D, H, W)
        # Target from pipeline has shape (B, 1, D, H, W)
        y_true = torch.squeeze(y_true, dim=1).long()
        return self.loss(y_pred, y_true)

class DiceCELoss(nn.Module):
    def __init__(self):
        super().__init__()
        self.dice = monai.losses.DiceLoss(sigmoid=True)
        self.cross_entropy = CrossEntropyLoss()

    def forward(self, y_pred, y_true):
        dice = self.dice(y_pred, y_true)
        cross_entropy = self.cross_entropy(y_pred, y_true)
        return dice + cross_entropy

loss_function = DiceCELoss()
However, after changing this part, the code no longer works and reports the following error. Could you please help me find what is wrong?
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [34,0,0], thread: [893,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [34,0,0], thread: [894,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [34,0,0], thread: [895,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [34,0,0], thread: [384,0,0] Assertion `t >= 0 && t < n_classes` failed.
Traceback (most recent call last):
File "2D_UNet.py", line 259, in <module>
main()
File "2D_UNet.py", line 191, in main
loss.backward()
File "/home/anaconda3/envs/monai/lib/python3.7/site-packages/torch/tensor.py", line 221, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/home/anaconda3/envs/monai/lib/python3.7/site-packages/torch/autograd/__init__.py", line 132, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED
Thanks a lot.
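A possible explanation, offered as a sketch and not a confirmed diagnosis: the assertion `t >= 0 && t < n_classes` fires when a target label is not a valid class index for the prediction. nn.CrossEntropyLoss takes the number of classes from the channel dimension of y_pred, so if the network has a single output channel (an assumption here, suggested by sigmoid = True and the single target structure), the only valid label is 0 and every foreground voxel labelled 1 is out of range. The helper below is hypothetical and uses a plain Python list in place of a tensor; it only reproduces that range check.

```python
def out_of_range_labels(labels, n_classes):
    """Return the sorted label values that are not valid class indices.

    Mirrors the condition `t >= 0 && t < n_classes` from the CUDA assertion.
    `labels` is any iterable of integer labels (flattened for simplicity).
    """
    return sorted({t for t in labels if not (0 <= t < n_classes)})


# Binary mask with background 0 and foreground 1 (illustrative values).
mask = [0, 0, 1, 1, 0, 1]

# One output channel -> n_classes == 1 -> label 1 is rejected.
print(out_of_range_labels(mask, n_classes=1))  # [1]

# Two output channels (background + foreground) -> all labels are valid.
print(out_of_range_labels(mask, n_classes=2))  # []
```

If that is the cause, two common options are to give the network two output channels for the cross-entropy term (the Dice term would then need matching settings, such as one-hot targets), or to keep one channel and use nn.BCEWithLogitsLoss with a float target of the same shape as the prediction. The CUDNN_STATUS_NOT_INITIALIZED error may simply be a downstream symptom of the failed device-side assertion; running one batch on the CPU usually gives a clearer message.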
Describe the bug
I got a tkinter runtime error related to threads when running spleen_segmentation_3d.ipynb locally, in the epoch cell.
epoch 12/600
1/16, train_loss: 0.5761
2/16, train_loss: 0.5969
3/16, train_loss: 0.4487
4/16, train_loss: 0.5615
Exception ignored in: <function Image.__del__ at 0x7f1d2ec71ca0>
Traceback (most recent call last):
File "/usr/lib/python3.8/tkinter/__init__.py", line 4014, in __del__
self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
5/16, train_loss: 0.5464
Exception ignored in: <function Variable.__del__ at 0x7f1d2ec583a0>
Traceback (most recent call last):
File "/usr/lib/python3.8/tkinter/__init__.py", line 351, in __del__
if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
[the Image.__del__ and Variable.__del__ tracebacks repeat dozens of times, later interleaved with the line below]
Tcl_AsyncDelete: async handler deleted by the wrong thread
Traceback (most recent call last):
File "/home/phc/.virtualenvs/monai/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 872, in _try_get_data
data = self._data_queue.get(timeout=timeout)
File "/usr/lib/python3.8/multiprocessing/queues.py", line 107, in get
if not self._poll(timeout):
File "/usr/lib/python3.8/multiprocessing/connection.py", line 257, in poll
return self._poll(timeout)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 424, in _poll
r = wait([self], timeout)
File "/usr/lib/python3.8/multiprocessing/connection.py", line 931, in wait
ready = selector.select(timeout)
File "/usr/lib/python3.8/selectors.py", line 415, in select
fd_event_list = self._selector.poll(timeout)
File "/home/phc/.virtualenvs/monai/lib/python3.8/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 355179) is killed by signal: Aborted.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "spleen_segmentation_3d.py", line 268, in <module>
for batch_data in train_loader:
File "/home/phc/.virtualenvs/monai/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
data = self._next_data()
File "/home/phc/.virtualenvs/monai/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1068, in _next_data
idx, data = self._get_data()
File "/home/phc/.virtualenvs/monai/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1034, in _get_data
success, data = self._try_get_data()
File "/home/phc/.virtualenvs/monai/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 885, in _try_get_data
raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
RuntimeError: DataLoader worker (pid(s) 355179) exited unexpectedly
Environment (please complete the following information):
Additional context
Sorry for the brevity of the report. The notebook is run as a Python script using jupytext (which converts .ipynb to .py).
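A sketch of one commonly used workaround, assuming the errors come from matplotlib's Tk-based interactive backend being picked up once the notebook runs as a plain script (the report does not confirm which backend was active): select the non-interactive Agg backend before matplotlib is first imported. MPLBACKEND is matplotlib's documented environment variable for this; placing it at the very top of the converted script is an assumption about how the script is laid out.

```python
import os

# Must run before the first `import matplotlib` anywhere in the script,
# so that figures are rendered off-screen and no Tk objects are created.
os.environ["MPLBACKEND"] = "Agg"

# The rest of the script would follow, for example:
#   import matplotlib.pyplot as plt
#   plt.plot(losses)
#   plt.savefig("loss.png")   # save to a file; Agg cannot open a window
print(os.environ["MPLBACKEND"])
```

With Agg selected, plt.show() does not open a window, so any figure that should be kept has to be written out with plt.savefig(...).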
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 5/100, Iter: 21/66 -- train_loss: 0.8563
ERROR:ignite.engine.engine.SupervisedTrainer:Current run is terminating due to exception: DataLoader worker (pid 1476) is killed by signal: Killed. .
ERROR:ignite.engine.engine.SupervisedTrainer:Exception: DataLoader worker (pid 1476) is killed by signal: Killed.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 811, in _run_once_on_dataset
self.state.output = self._process_function(self, self.state.batch)
File "/usr/local/lib/python3.6/dist-packages/monai/engines/trainer.py", line 156, in _iteration
self.scaler.step(self.optimizer)
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/amp/grad_scaler.py", line 320, in step
if not sum(v.item() for v in optimizer_state["found_inf_per_device"].values()):
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/amp/grad_scaler.py", line 320, in
if not sum(v.item() for v in optimizer_state["found_inf_per_device"].values()):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 1476) is killed by signal: Killed.
ERROR:ignite.engine.engine.SupervisedTrainer:Engine run is terminating due to exception: DataLoader worker (pid 1476) is killed by signal: Killed. .
ERROR:ignite.engine.engine.SupervisedTrainer:Exception: DataLoader worker (pid 1476) is killed by signal: Killed.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 730, in _internal_run
time_taken = self._run_once_on_dataset()
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 828, in _run_once_on_dataset
self._handle_exception(e)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 465, in _handle_exception
self._fire_event(Events.EXCEPTION_RAISED, e)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 423, in _fire_event
func(*first, *(event_args + others), **kwargs)
File "/usr/local/lib/python3.6/dist-packages/monai/handlers/stats_handler.py", line 145, in exception_raised
raise e
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 811, in _run_once_on_dataset
self.state.output = self._process_function(self, self.state.batch)
File "/usr/local/lib/python3.6/dist-packages/monai/engines/trainer.py", line 156, in _iteration
self.scaler.step(self.optimizer)
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/amp/grad_scaler.py", line 320, in step
if not sum(v.item() for v in optimizer_state["found_inf_per_device"].values()):
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/amp/grad_scaler.py", line 320, in
if not sum(v.item() for v in optimizer_state["found_inf_per_device"].values()):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 1476) is killed by signal: Killed.
Traceback (most recent call last):
File "run_net.py", line 301, in
train(data_folder=data_folder, model_folder=args.model_folder)
File "run_net.py", line 211, in train
trainer.run()
File "/usr/local/lib/python3.6/dist-packages/monai/engines/trainer.py", line 46, in run
super().run()
File "/usr/local/lib/python3.6/dist-packages/monai/engines/workflow.py", line 163, in run
super().run(data=self.data_loader, max_epochs=self.state.max_epochs)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 691, in run
return self._internal_run()
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 762, in _internal_run
self._handle_exception(e)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 465, in _handle_exception
self._fire_event(Events.EXCEPTION_RAISED, e)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 423, in _fire_event
func(*first, *(event_args + others), **kwargs)
File "/usr/local/lib/python3.6/dist-packages/monai/handlers/stats_handler.py", line 145, in exception_raised
raise e
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 730, in _internal_run
time_taken = self._run_once_on_dataset()
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 828, in _run_once_on_dataset
self._handle_exception(e)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 465, in _handle_exception
self._fire_event(Events.EXCEPTION_RAISED, e)
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 423, in _fire_event
func(*first, *(event_args + others), **kwargs)
File "/usr/local/lib/python3.6/dist-packages/monai/handlers/stats_handler.py", line 145, in exception_raised
raise e
File "/usr/local/lib/python3.6/dist-packages/ignite/engine/engine.py", line 811, in _run_once_on_dataset
self.state.output = self._process_function(self, self.state.batch)
File "/usr/local/lib/python3.6/dist-packages/monai/engines/trainer.py", line 156, in _iteration
self.scaler.step(self.optimizer)
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/amp/grad_scaler.py", line 320, in step
if not sum(v.item() for v in optimizer_state["found_inf_per_device"].values()):
File "/usr/local/lib/python3.6/dist-packages/torch/cuda/amp/grad_scaler.py", line 320, in
if not sum(v.item() for v in optimizer_state["found_inf_per_device"].values()):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 1476) is killed by signal: Killed.
subtask of Project-MONAI/MONAI#1318
We need to make sure that pip install monai[all] works for all the notebooks and demos.
Hello, I wanted to ask whether anyone has used the Wasserstein distance loss to segment different brain structures, because I have some issues. For example, the argument that I should pass to my pipeline is a distance matrix, and I would like to know how to construct it; I don't know where those numbers come from. My other question is whether the Wasserstein distance loss is finished and has been validated in any experiment (other than brain tumor segmentation, where the labels are continuous, whereas my labels are separated).
Thank you.
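On how such a matrix is usually built, as a sketch and not a statement about what MONAI requires: in the generalized Wasserstein Dice formulation the matrix holds a user-chosen distance between every pair of classes, with zeros on the diagonal, symmetric entries, and smaller values for classes whose confusion should be penalised less. The numbers are therefore a modelling choice, not something computed from the images. The class names and distances below are invented purely for illustration.

```python
# Hypothetical label set: index 0 is background, the rest are structures.
classes = ["background", "left_hippocampus", "right_hippocampus", "thalamus"]

# Invented pairwise distances in [0, 1]; unlisted pairs default to 1.0
# (maximally different). Here the two hippocampi are treated as similar.
closer_pairs = {("left_hippocampus", "right_hippocampus"): 0.3}

n = len(classes)
dist_matrix = [[0.0 if i == j else 1.0 for j in range(n)] for i in range(n)]
for (a, b), d in closer_pairs.items():
    i, j = classes.index(a), classes.index(b)
    dist_matrix[i][j] = dist_matrix[j][i] = d  # keep the matrix symmetric

for row in dist_matrix:
    print(row)
```

If the loss expects an array, a nested list like this can be converted with numpy.array(dist_matrix) before being passed as the distance-matrix argument.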
I trained the spleen segmentation model for 200 epochs on the Decathlon dataset. When I then evaluated it on my own dataset, the segmentation performance was extremely poor. Do you know how I can fine-tune the model parameters on my own dataset? How should I do that, and for how many epochs? (My dataset comprises 20 manually segmented spleens.)
Thanks
Aymen
(pytorch) rasho@rasho-WS-E500-G5-WS690T:~/covid-19_3D_Segmentation$ python run_net.py train --data_folder "COVID-19-20_v2/Train" --model_folder "runs"
MONAI version: 0.3.0+81.g62b0bbb
Python version: 3.7.9 (default, Aug 31 2020, 12:42:55) [GCC 7.3.0]
OS version: Linux (5.4.0-53-generic)
Numpy version: 1.19.2
Pytorch version: 1.7.0
MONAI flags: HAS_EXT = False, USE_COMPILED = False
Optional dependencies:
Pytorch Ignite version: 0.4.2
Nibabel version: 3.2.0
scikit-image version: NOT INSTALLED or UNKNOWN VERSION.
Pillow version: 8.0.1
Tensorboard version: NOT INSTALLED or UNKNOWN VERSION.
gdown version: NOT INSTALLED or UNKNOWN VERSION.
TorchVision version: 0.8.1
ITK version: NOT INSTALLED or UNKNOWN VERSION.
tqdm version: 4.52.0
For details about installing the optional dependencies, please visit:
https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies
INFO:root:training: image/label (199) folder: COVID-19-20_v2/Train
INFO:root:training: train 160 val 39, folder: COVID-19-20_v2/Train
INFO:root:batch size 2
Load and cache transformed data: 100%|█████████████████████████████████████████████████████████████████████| 160/160 [04:13<00:00, 1.58s/it]
Load and cache transformed data: 100%|███████████████████████████████████████████████████████████████████████| 39/39 [00:58<00:00, 1.51s/it]
BasicUNet features: (32, 32, 64, 128, 256, 32).
INFO:root:epochs 500, lr 0.0001, momentum 0.95
INFO:ignite.engine.engine.SupervisedTrainer:Engine run resuming from iteration 0, epoch 0 until 500 epochs
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 1/80 -- train_loss: 1.4053
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 2/80 -- train_loss: 1.3833
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 3/80 -- train_loss: 1.3598
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 4/80 -- train_loss: 1.3268
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 5/80 -- train_loss: 1.3438
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 6/80 -- train_loss: 1.3146
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 7/80 -- train_loss: 1.3164
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 8/80 -- train_loss: 1.3118
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 9/80 -- train_loss: 1.2970
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 10/80 -- train_loss: 1.2957
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 11/80 -- train_loss: 1.2779
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 12/80 -- train_loss: 1.2499
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 13/80 -- train_loss: 1.2641
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 14/80 -- train_loss: 1.2634
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 15/80 -- train_loss: 1.2439
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 16/80 -- train_loss: 1.2206
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 17/80 -- train_loss: 1.2209
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 18/80 -- train_loss: 1.2143
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 19/80 -- train_loss: 1.1976
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 20/80 -- train_loss: 1.1950
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 21/80 -- train_loss: 1.1833
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 22/80 -- train_loss: 1.1747
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 23/80 -- train_loss: 1.1739
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 24/80 -- train_loss: 1.1676
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 25/80 -- train_loss: 1.1586
INFO:ignite.engine.engine.SupervisedTrainer:Epoch: 1/500, Iter: 26/80 -- train_loss: 1.1585
Killed
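A bare "Killed" with no Python traceback usually means the operating system terminated the process, on Linux most often the out-of-memory killer. The log does not confirm this, so checking dmesg or watching system memory during training would be the first step. Since the run caches all 160 training and 39 validation items in RAM, a rough estimate of the cache size can help. The sketch below is back-of-the-envelope arithmetic only; the volume shape and dtype are hypothetical placeholders, not values taken from this dataset.

```python
from math import prod


def cache_size_gib(n_items, shape, bytes_per_voxel=4, arrays_per_item=2):
    """Estimate RAM (GiB) needed to cache `n_items` samples.

    shape            -- voxel dimensions of one preprocessed volume (assumed)
    bytes_per_voxel  -- 4 for float32
    arrays_per_item  -- 2 for an image plus its label
    """
    total_bytes = n_items * arrays_per_item * prod(shape) * bytes_per_voxel
    return total_bytes / 2**30


# 160 training + 39 validation items, as in the log above.
# (512, 512, 64) is a made-up example shape; substitute the real one.
estimate = cache_size_gib(160 + 39, (512, 512, 64))
print(f"{estimate:.1f} GiB")
```

If the estimate approaches the available RAM, CacheDataset accepts cache_rate or cache_num to cache only part of the data, and lowering num_workers in the DataLoader reduces the number of worker processes that hold data in memory.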
Describe the bug
All 3D classification tutorials assume that the data is in 'workspace/data/medical/ixi/IXI-T1/', but none of them download it.
To Reproduce
Expected behavior
Tutorial should be able to run the whole way through without user intervention.
Environment (please complete the following information):
N/A
Describe the bug
Trying to follow https://github.com/Project-MONAI/tutorials/blob/17bf2ec91e2871898198084f4ba5e968c2bef47e/3d_segmentation/spleen_segmentation_3d.ipynb I run into a traceback at step "Execute a typical PyTorch training process".
To Reproduce
I installed everything using pip, in a virtual environment. I needed to allow the CPU back end, as my laptop has an AMD GPU: device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
----------
epoch 1/600
---------------------------------------------------------------------------
PicklingError Traceback (most recent call last)
<ipython-input-13-ab25791c97e3> in <module>
12 epoch_loss = 0
13 step = 0
---> 14 for batch_data in train_loader:
15 step += 1
16 inputs, labels = (
c:\dev\monai\pyenv\lib\site-packages\torch\utils\data\dataloader.py in __iter__(self)
277 return _SingleProcessDataLoaderIter(self)
278 else:
--> 279 return _MultiProcessingDataLoaderIter(self)
280
281 @property
c:\dev\monai\pyenv\lib\site-packages\torch\utils\data\dataloader.py in __init__(self, loader)
717 # before it starts, and __del__ tries to join but will get:
718 # AssertionError: can only join a started process.
--> 719 w.start()
720 self._index_queues.append(index_queue)
721 self._workers.append(w)
C:\Dev\Python3.7.9\lib\multiprocessing\process.py in start(self)
110 'daemonic processes are not allowed to have children'
111 _cleanup()
--> 112 self._popen = self._Popen(self)
113 self._sentinel = self._popen.sentinel
114 # Avoid a refcycle if the target function holds an indirect
C:\Dev\Python3.7.9\lib\multiprocessing\context.py in _Popen(process_obj)
221 @staticmethod
222 def _Popen(process_obj):
--> 223 return _default_context.get_context().Process._Popen(process_obj)
224
225 class DefaultContext(BaseContext):
C:\Dev\Python3.7.9\lib\multiprocessing\context.py in _Popen(process_obj)
320 def _Popen(process_obj):
321 from .popen_spawn_win32 import Popen
--> 322 return Popen(process_obj)
323
324 class SpawnContext(BaseContext):
C:\Dev\Python3.7.9\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
87 try:
88 reduction.dump(prep_data, to_child)
---> 89 reduction.dump(process_obj, to_child)
90 finally:
91 set_spawning_popen(None)
C:\Dev\Python3.7.9\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
58 def dump(obj, file, protocol=None):
59 '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60 ForkingPickler(file, protocol).dump(obj)
61
62 #
PicklingError: Can't pickle <function CropForegroundd.<lambda> at 0x000001E8B1AED318>: attribute lookup CropForegroundd.<lambda> on monai.transforms.croppad.dictionary failed
Environment (please complete the following information):
Windows 10
MONAI version: 0.2.0
Python version: 3.7.9 (tags/v3.7.9:13c94747c7, Aug 17 2020, 18:58:18) [MSC v.1900 64 bit (AMD64)]
Numpy version: 1.19.1
Pytorch version: 1.4.0+cpu
Optional dependencies:
Pytorch Ignite version: 0.3.0
Nibabel version: 3.1.1
scikit-image version: 0.17.2
Pillow version: 7.2.0
Tensorboard version: 2.3.0
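The traceback ends in popen_spawn_win32, which is consistent with how multiprocessing starts workers on Windows: each worker process is spawned, and the dataset, including its transforms, has to be pickled to reach it. A lambda cannot be pickled by the standard pickle module, which is what the CropForegroundd.<lambda> message reports. The sketch below only demonstrates that behaviour with the standard library; is_picklable is a hypothetical helper, and functools.partial(operator.lt, 0) is one picklable stand-in for lambda x: x > 0.

```python
import operator
import pickle
from functools import partial


def is_picklable(obj):
    """Return True if `obj` survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


select_lambda = lambda x: x > 0           # the kind of object the error names
select_partial = partial(operator.lt, 0)  # computes 0 < x, i.e. x > 0

print(is_picklable(select_lambda))
print(is_picklable(select_partial))
print(select_partial(3), select_partial(-1))
```

Possible workarounds, untested against MONAI 0.2.0: set num_workers=0 in the DataLoader so that no worker processes are spawned, or pass a picklable callable as select_fn to CropForegroundd instead of relying on its default.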
Is your feature request related to a problem? Please describe.
Since LoadImage is now the recommended loading method, we need to update all the examples and tutorials.
Should we be pip installing monai and its dependencies at the start of each notebook?
Discussion continued from #47.
My personal feeling is that we should lift all pip installs from our notebooks, as the relevant instructions are already in our README.md. It also saves us from having to update them as our notebooks and dependencies change.
Describe the bug
I am trying to adapt the spleen_segmentation_3d.ipynb notebook to imaging data with a slightly different shape. The images in the Spleen set are 226x257 with 113 planes in the stack. My images are 1200x340 with 20 planes in the stack. The notebook samples the data in cubes of size 96x96x96. To get the example notebook to work, I have to duplicate my data on the planes to be 20+20+20+20+16 = 96. Otherwise it breaks, for the obvious reason that you can't get 96 slices out of 20.
Suppose however that I change the cube size to 20x20x20, so I don't duplicate planes to match the exact setup of the notebook. I still get a problem. Here is the problem, please let me know how to resolve it:
RuntimeError Traceback (most recent call last)
<ipython-input-15-26b65d7e4120> in <module>
19 )
20 optimizer.zero_grad()
---> 21 outputs = model(inputs)
22 loss = loss_function(outputs, labels)
23 loss.backward()
~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
~/anaconda3/envs/mona/lib/python3.8/site-packages/monai/networks/nets/unet.py in forward(self, x)
190
191 def forward(self, x: torch.Tensor) -> torch.Tensor:
--> 192 x = self.model(x)
193 return x
194
~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/container.py in forward(self, input)
98 def forward(self, input):
99 for module in self:
--> 100 input = module(input)
101 return input
102
~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
~/anaconda3/envs/mona/lib/python3.8/site-packages/monai/networks/layers/simplelayers.py in forward(self, x)
37
38 def forward(self, x: torch.Tensor) -> torch.Tensor:
---> 39 return torch.cat([x, self.submodule(x)], self.cat_dim)
40
41
~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/container.py in forward(self, input)
98 def forward(self, input):
99 for module in self:
--> 100 input = module(input)
101 return input
102
~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
~/anaconda3/envs/mona/lib/python3.8/site-packages/monai/networks/layers/simplelayers.py in forward(self, x)
37
38 def forward(self, x: torch.Tensor) -> torch.Tensor:
---> 39 return torch.cat([x, self.submodule(x)], self.cat_dim)
40
41
~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/container.py in forward(self, input)
98 def forward(self, input):
99 for module in self:
--> 100 input = module(input)
101 return input
102
~/anaconda3/envs/mona/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
~/anaconda3/envs/mona/lib/python3.8/site-packages/monai/networks/layers/simplelayers.py in forward(self, x)
37
38 def forward(self, x: torch.Tensor) -> torch.Tensor:
---> 39 return torch.cat([x, self.submodule(x)], self.cat_dim)
40
41
RuntimeError: Sizes of tensors must match except in dimension 2. Got 4 and 3
To Reproduce
Here is the code:
import glob, os, torch
from monai.data import CacheDataset, DataLoader, Dataset
from monai.inferers import sliding_window_inference
from monai.losses import DiceLoss
from monai.metrics import compute_meandice
from monai.networks.layers import Norm
from monai.networks.nets import UNet
from monai.utils import first, set_determinism
data_dir='nf1_monai'
os.environ['MONAI_DATA_DIRECTORY']=data_dir
directory = os.environ.get("MONAI_DATA_DIRECTORY")
root_dir = directory
train_images = sorted(glob.glob(os.path.join(data_dir, "imagesTr", "*.npy")))
train_labels = sorted(glob.glob(os.path.join(data_dir, "labelsTr", "*.npy")))
data_dicts = [
    {"image": image_name, "label": label_name}
    for image_name, label_name in zip(train_images, train_labels)
]
train_files, val_files = data_dicts[:-10], data_dicts[-10:]
set_determinism(seed=0)
from monai.transforms import (
    AddChanneld,
    Compose,
    LoadNumpyd,
    RandCropByPosNegLabeld,
    ToTensord,
)
train_transforms = Compose(
    [
        LoadNumpyd(keys=["image", "label"]),
        AddChanneld(keys=["image", "label"]),
        RandCropByPosNegLabeld(
            keys=["image", "label"],
            label_key="label",
            spatial_size=(20,20,20),
            pos=1,
            neg=1,
            num_samples=4,
            image_key="image",
            image_threshold=0,
        ),
        ToTensord(keys=["image", "label"]),
    ]
)
val_transforms = Compose(
    [
        LoadNumpyd(keys=["image", "label"]),
        AddChanneld(keys=["image", "label"]),
        ToTensord(keys=["image", "label"]),
    ]
)
device = torch.device("cuda:0")
model = UNet(
    dimensions=3,
    in_channels=1,
    out_channels=2,
    channels=(16, 32, 64, 128, 256),
    strides=(2, 2, 2, 2),
    num_res_units=2,
    norm=Norm.BATCH,
).to(device)
loss_function = DiceLoss(to_onehot_y=True, softmax=True)
optimizer = torch.optim.Adam(model.parameters(), 1e-4)
train_ds = CacheDataset(data=train_files, transform=train_transforms, cache_rate=1.0, num_workers=1)
train_loader = DataLoader(train_ds, batch_size=6, shuffle=True, num_workers=16)
val_ds = CacheDataset(data=val_files, transform=val_transforms, cache_rate=1.0, num_workers=16)
val_loader = DataLoader(val_ds, batch_size=1, num_workers=1)
epoch_num = 600
val_interval = 2
best_metric = -1
best_metric_epoch = -1
epoch_loss_values = list()
metric_values = list()
for epoch in range(epoch_num):
    print("-" * 10)
    print(f"epoch {epoch + 1}/{epoch_num}")
    model.train()
    epoch_loss = 0
    step = 0
    for batch_data in train_loader:
        step += 1
        inputs, labels = (
            batch_data["image"].to(device),
            batch_data["label"].to(device),
        )
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = loss_function(outputs, labels)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
        print(f"{step}/{len(train_ds) // train_loader.batch_size}, train_loss: {loss.item():.4f}")
    epoch_loss /= step
    epoch_loss_values.append(epoch_loss)
    print(f"epoch {epoch + 1} average loss: {epoch_loss:.4f}")
Expected behavior
The UNet should train and not break.
Environment (please complete the following information):
OS: Ubuntu 20.04LTS
MONAI version: 0.3.0
Python version: 3.8.2 (default, Mar 26 2020, 15:53:00) [GCC 7.3.0]
OS version: Linux (5.4.0-52-generic)
Numpy version: 1.18.1
Pytorch version: 1.5.0
MONAI flags: HAS_EXT = False, USE_COMPILED = False
Optional dependencies:
Pytorch Ignite version: 0.3.0
Nibabel version: 3.1.0
scikit-image version: 0.16.2
Pillow version: 7.1.2
Tensorboard version: 2.2.1
gdown version: 3.12.2
TorchVision version: 0.6.0a0+82fd1c8
ITK version: 5.1.1
tqdm version: 4.50.2
Additional context
I am trying to do tumor detection on whole-body MRI scans. The tumors are small and the body is large. So far this is giving me an average F1 score of 0.17 using this library, training with the 20+20+20+20+16 stacking workaround.
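The "Got 4 and 3" in the error above is consistent with simple size arithmetic on the skip connections. The sketch below is a simplified model, not MONAI's code: it assumes each stride-2 encoder level produces ceil(n / 2) voxels along an axis and each decoder level exactly doubles the size before concatenating with the encoder tensor of the same depth. The helper names `unet_skip_sizes` and `first_mismatch` are hypothetical.

```python
import math


def unet_skip_sizes(size, strides):
    """Trace one spatial axis down the encoder and back up the decoder.

    Returns (skip_size, upsampled_size) pairs, deepest level first.
    Assumption: a stride-s level maps n -> ceil(n / s) on the way down
    and n -> n * s on the way up.
    """
    down = [size]
    for s in strides:
        down.append(math.ceil(down[-1] / s))

    pairs = []
    for level in range(len(strides) - 1, -1, -1):
        skip = down[level]                      # encoder tensor at this depth
        upsampled = down[level + 1] * strides[level]  # decoder tensor to concatenate
        pairs.append((skip, upsampled))
    return pairs


def first_mismatch(size, strides):
    """Return the first (skip, upsampled) pair that cannot be concatenated, or None."""
    for skip, upsampled in unet_skip_sizes(size, strides):
        if skip != upsampled:
            return skip, upsampled
    return None


print(first_mismatch(20, (2, 2, 2, 2)))  # (3, 4)
print(first_mismatch(96, (2, 2, 2, 2)))  # None
print(first_mismatch(16, (2, 2, 2, 2)))  # None
```

Under these assumptions a patch of 20 shrinks as 20, 10, 5, 3, 2, and the deepest upsampling step returns 4 where the skip tensor has 3. With four stride-2 levels, patch sizes that are multiples of 16 along every axis avoid the mismatch, so for a 20-plane stack a depth of 16 would be one candidate to try.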
Describe the bug
I just tried running the 2D segmentation tutorial on my own 2D images (a mixed dataset of TIFF, PNG, JPG and BMP images). I ran into several problems. For example, LoadPNGd cannot handle TIFF images, and the rest of the transform pipeline throws an error (I think PIL loads TIFF images differently from the other formats; I usually use skimage.io, which always returns a numpy array). The biggest problem, though, is that the transform pipeline cannot handle the color channel in RGB images, or I am doing something wrong when applying the Resized() transform. The latter is necessary because I need images at a fixed size of 320x240 at the end of the transform pipeline.
To Reproduce
Put a few RGB color images (maybe including at least one TIFF image ;) into a directory, then set up a simple transform pipeline like this:
train_transforms = Compose(
    [
        LoadImaged(keys=["img"]),
        LoadNumpyd(keys=["seg"]),  # my segs are four channels stored as numpy array, of shape (height,width,4)
        ScaleIntensityd(keys="img"),
        Resized(keys=["img", "seg"], spatial_size=(240,320), mode='bilinear', align_corners=True),
        RandFlipd(keys=["img", "seg"], prob=0.5),
        ToTensord(keys=["img", "seg"]),
    ]
)
Then, to check the shape of the output tensors:
# define check dataset, check data loader
check_ds = monai.data.Dataset(data=train_files, transform=train_transforms)
# use batch_size=2 to load images and use RandCropByPosNegLabeld to generate 2 x 4 images for network training
check_loader = DataLoader(check_ds, batch_size=2, num_workers=4, collate_fn=list_data_collate)
check_data = monai.utils.misc.first(check_loader)
print(check_data["img"].shape, check_data["seg"].shape)
plt.imshow(np.squeeze(check_data["img"][0,0,:,:]))
Expected behavior
If the color channel is handled correctly, I expect the shape of the tensors to be [2,3,240,320].
Observed behavior
The output shape is [2,300,240,320] (please note that in my case, monai.utils.misc.first(check_loader) loads an image of shape [300,400,3]).
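The observed shape is what one would expect if the resize step treats axis 0 as the channel axis, which is the channel-first convention MONAI's spatial transforms follow. The sketch below only does shape bookkeeping to illustrate that reading; `resize_channel_first` and `channel_last_to_first` are hypothetical helpers, not MONAI functions.

```python
def resize_channel_first(shape, spatial_size):
    """Shape after a resize that keeps axis 0 (channels) and replaces the rest."""
    if len(shape) - 1 != len(spatial_size):
        raise ValueError("spatial_size must cover every non-channel axis")
    return (shape[0],) + tuple(spatial_size)


def channel_last_to_first(shape):
    """Shape after moving the last axis to the front, e.g. (H, W, C) -> (C, H, W)."""
    return (shape[-1],) + tuple(shape[:-1])


loaded = (300, 400, 3)  # (H, W, C), as reported in the issue
target = (240, 320)

as_is = resize_channel_first(loaded, target)
fixed = resize_channel_first(channel_last_to_first(loaded), target)
print(as_is)  # (300, 240, 320)
print(fixed)  # (3, 240, 320)
```

In the pipeline itself this would correspond to placing a channel-moving transform (for example AsChannelFirstd, if it is available in the installed version) for both "img" and "seg" before Resized. Since "seg" holds label channels, a nearest-neighbour mode for that key is probably also worth considering.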
Environment (please complete the following information):
Describe the bug
Hi @ericspod, could you please help add the Colab button and installation from the latest MONAI code (as we haven't released 0.4 yet) to the ThreadBuffer notebook?
Thanks.
Hi there -
I'm new to MONAI and doing some learning of the brain tumor segmentation code - referring to the file brats_segmentation_3d.ipynb under tutorials/3d_segmentation. I'm using this code AS-IS in my Jupyter Notebook. While training on the Medical Decathlon dataset, exactly after epoch 2 I see the following error:
ValueError Traceback (most recent call last)
in
54 # metric_sum += value.item() * not_nans
55 # compute mean dice for TC
---> 56 value_tc, not_nans = dice_metric(y_pred=val_outputs[:, 0:1], y=val_labels[:, 0:1])
57 not_nans = not_nans.item()
58 metric_count_tc += not_nans
ValueError: not enough values to unpack (expected 2, got 1)
Can you please suggest anything to rectify this problem?
Many thanks,
Sekhar H.
Dear all,
After pip installing ITK or SimpleITK the "print_config()" prompt does not find the installed ITK version.
Moreover, while executing the "densenet_training.array.ipynb" tutorial I get this error:
OptionalImportError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/monai/transforms/utils.py in apply_transform(transform, data, map_items)
308 return [transform(item) for item in data]
--> 309 return transform(data)
310 except Exception as e:
35 frames
OptionalImportError: import itk (No module named 'itk').
For details about installing the optional dependencies, please visit:
https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies
During handling of the above exception, another exception occurred:
OptionalImportError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/monai/utils/module.py in optional_import(module, version, version_checker, name, descriptor, version_args, allow_namespace_pkg)
165 actual_cmd = f"import {module}"
166 try:
--> 167 pkg = __import__(module) # top level module
168 the_module = import_module(module)
169 if not allow_namespace_pkg:
OptionalImportError: Applying transform <monai.transforms.io.dictionary.LoadImaged object at 0x7f7d33d66860>.
Regards,
Sebastian
I am trying to implement torchio and batchgenerators augmentations following this tutorial:
https://github.com/Project-MONAI/Tutorials/blob/master/integrate_3rd_party_transforms.ipynb
The spatial transformations should also affect my label maps. However, linear or B-spline interpolation, which makes sense for image data, is not something I want to apply to my label maps. What is the best way to implement that?
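A small 1-D illustration of why label maps are usually resampled with nearest-neighbour interpolation: linear interpolation can produce values that are not valid labels. This is a toy resampler written for this example only; `resample_1d` is hypothetical and not part of MONAI, torchio or batchgenerators.

```python
def resample_1d(values, new_len, mode):
    """Resample a 1-D list to new_len samples using 'nearest' or 'linear'."""
    old_len = len(values)
    out = []
    for i in range(new_len):
        # Map the output index onto the input coordinate range [0, old_len - 1].
        pos = i * (old_len - 1) / (new_len - 1) if new_len > 1 else 0.0
        if mode == "nearest":
            out.append(values[int(round(pos))])
        elif mode == "linear":
            lo = int(pos)
            hi = min(lo + 1, old_len - 1)
            frac = pos - lo
            out.append(values[lo] * (1 - frac) + values[hi] * frac)
        else:
            raise ValueError(f"unknown mode: {mode}")
    return out


labels = [0, 0, 2, 2, 0]  # two classes: background 0 and structure 2
nearest = resample_1d(labels, 9, "nearest")
linear = resample_1d(labels, 9, "linear")
print(nearest)
print(linear)
```

The nearest-neighbour result contains only 0 and 2, while the linear result contains 1.0 at the class boundaries, a value that does not correspond to any class. Many of MONAI's dictionary spatial transforms accept one interpolation mode per key, so image and label can be given different modes; for the third-party libraries it is worth checking whether they offer an equivalent per-key or per-type setting.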
Dear all,
I'm adapting the 3D classifier tutorial "densenet_training" to my example files.
My NIfTI files have a different size, so I get this error when passing the input to the model:
Expected 5-dimensional input for 5-dimensional weight [64, 1, 7, 7, 7], but got 4-dimensional input of size [2, 224, 224, 160] instead
How can I modify the code so I can test the tutorial on my files?
Thanks
Is your feature request related to a problem? Please describe.
would be great to fix the Anaconda Python distribution with a predefined yml file, such as
https://github.com/Project-MONAI/MONAIBootcamp2020#instal-local-environment
this ticket looks for an automated CI setup to ensure the quality of the notebooks.
see also discussions:
!python run_net.py train --data_folder "COVID-19-20_v2/Train" --model_folder "runs"
The code cell stops after showing this result. I don't know what I should do next. How can I find the models?
MONAI version: 0.3.0+87.ge94e243
Python version: 3.6.9 (default, Oct 8 2020, 12:12:24) [GCC 8.4.0]
OS version: Linux (4.19.112+)
Numpy version: 1.18.5
Pytorch version: 1.7.0+cu101
MONAI flags: HAS_EXT = False, USE_COMPILED = False
Optional dependencies:
Pytorch Ignite version: 0.4.2
Nibabel version: 3.0.2
scikit-image version: 0.16.2
Pillow version: 7.0.0
Tensorboard version: 2.3.0
gdown version: 3.6.4
TorchVision version: 0.8.1+cu101
ITK version: NOT INSTALLED or UNKNOWN VERSION.
tqdm version: 4.53.0
For details about installing the optional dependencies, please visit:
https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies
INFO:root:training: image/label (199) folder: COVID-19-20_v2/Train
INFO:root:training: train 160 val 39, folder: COVID-19-20_v2/Train
INFO:root:batch size 2
Load and cache transformed data: 100% 160/160 [05:30<00:00, 2.06s/it]
Load and cache transformed data: 100% 39/39 [01:24<00:00, 2.18s/it]
BasicUNet features: (32, 32, 64, 128, 256, 32).
^C
It seems that in the majority of tutorials, the optimisation for loop is given explicitly. In relatively few places, the SupervisedTrainer is used, despite existing for this reason.
I can see why having the explicit for loop is beneficial for tutorials - so that people are more aware of the inner workings. However, for the sake of conciseness, I would be in favour of having just one notebook (named suitably) in which the explicit for loop is given, and then from there on, using the SupervisedTrainer. Notebooks using SupervisedTrainer could then refer to the explicit notebook.
I think @ericspod is in favour of leaving the notebooks as they are, so as not to hide anything (which I understand). Anyone else have an opinion?
Thank you for the tutorials!
I can't find any log files saved in ./run and it seems that this part is not included in the code. (./3d_segmentation/baseline)
It would be much clearer if the training information is saved and plotted.
Another question: do the images under the 'Validation' folder have labels (ground truth), and if so, where are they?
Thank you!
Is your feature request related to a problem? Please describe.
Some notebooks are not following the PEP8 style guide.
Describe the solution you'd like
Please, consider following the PEP8 style guide in the notebooks from MONAI/examples/notebooks/.
For example, in examples/notebooks/mednist_tutorial.ipynb, cell 4 has variables named using the CamelCase style instead of snake_case (https://www.python.org/dev/peps/pep-0008/#id45), for example:
dataDir = './MedNIST/'
classNames = os.listdir(dataDir)
numClass = len(classNames)
Later, in the same notebook, the snake_case is adopted.
train_ds = MedNISTDataset(trainX, trainY, train_transforms)
train_loader = DataLoader(train_ds, batch_size=300, shuffle=True, num_workers=10)
val_ds = MedNISTDataset(valX, valY, val_transforms)
val_loader = DataLoader(val_ds, batch_size=300, num_workers=10)
Hello, I am a participant in the COVID challenge. Submission is now closed. I have a new prediction and would like to know its Dice score for my own research. Could you please open the evaluation website for me or share the evaluation method? It would be even better if the organizers could release the ground truth labels of the test and validation datasets. Thank you so much!
Is your feature request related to a problem? Please describe.
We have a rich set of network layers, blocks, etc. that support both 2D and 3D, and we also have a layer factory to generate common layers. But currently, we don't have a step-by-step tutorial showing how to use these APIs to develop networks.
install reviewnb https://www.reviewnb.com/ on this repo for diff & Commenting pull requests of jupyter notebooks
Describe the bug
The code from A U-Net model for lung lesion segmentation from CT images could not be run.
The error information is: TypeError: __init__() got an unexpected keyword argument 'to_onehot_y'.
To Reproduce
Steps to reproduce the behavior:
https://github.com/Project-MONAI/tutorials/tree/master/3d_segmentation/challenge_baseline/
pip install monai. NOTE: this is important. Different install methods lead to different errors (see #60).
python run_net.py train --data_folder "COVID-19-20_v2/Train" --model_folder "runs"
Expected behavior
Start training of the model.
Environment (please complete the following information):
Additional context
https://covid-segmentation.grand-challenge.org/Resource/
Is your feature request related to a problem? Please describe.
As the size and content scope of the MONAI/examples folder increase,
it's necessary to figure out the hardware/software requirements for running the examples,
and also provide some forms of quality assurance of the example codes.
Describe the solution you'd like
could automatically run the examples as a part of the automated CI/CD pipeline?
Describe alternatives you've considered
manually verifying all the examples regularly (tedious and error-prone)
Additional context
see also https://github.com/Project-MONAI/MONAI/issues/296
Hi,
I was wondering how UNet deals with the sliding window input.
Because the ROI you set is bigger than the patches UNet is trained on.
How does this work?
Thanks.
Kirsten
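Two things seem relevant here. First, a convolutional network is not tied to the patch size it was trained on, so it can usually be applied to a larger ROI as long as the size is compatible with its strides. Second, sliding-window inference cuts the volume into ROI-sized windows, runs the network on each window and blends the overlapping predictions. The sketch below shows only the window placement along one axis; it is a simplified illustration, not MONAI's implementation, and `window_starts` is a hypothetical helper.

```python
def window_starts(image_len, roi_len, overlap=0.25):
    """Start indices of windows of length roi_len covering [0, image_len)."""
    if roi_len >= image_len:
        # A single window; the image would need padding up to roi_len.
        return [0]
    step = max(int(roi_len * (1 - overlap)), 1)
    starts = list(range(0, image_len - roi_len, step))
    starts.append(image_len - roi_len)  # final window sits flush with the end
    return starts


print(window_starts(226, 160))  # [0, 66]
print(window_starts(113, 160))  # [0]
```

With an ROI of 160 along an axis of length 226, two windows starting at 0 and 66 cover the whole axis and overlap in the middle, where the predictions would be averaged or otherwise blended.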
Describe the bug
The code from A U-Net model for lung lesion segmentation from CT images could not be run.
The error information is: TypeError: __init__() got an unexpected keyword argument 'to_onehot_y'.
To Reproduce
Steps to reproduce the behavior:
https://github.com/Project-MONAI/tutorials/tree/master/3d_segmentation/challenge_baseline/
pip install "git+https://github.com/Project-MONAI/MONAI#egg=monai[nibabel,ignite,tqdm]"
python run_net.py train --data_folder "COVID-19-20_v2/Train" --model_folder "runs"
Expected behavior
Start training of the model.
Environment (please complete the following information):
Additional context
https://covid-segmentation.grand-challenge.org/Resource/
GradCAM module is inplace, would be great to have a 3D classification model demo, with a medical image related task
It would be nice to have examples of how to use learning rate schedulers with the MONAI classes. It would also be nice to have an LR finder like the one in fastai's one-cycle workflow.
As the experimental new APIs have been implemented for MONAI I/O Project-MONAI/MONAI#909
this ticket looks for a tutorial to show that:
(might be useful to briefly mention the optional import feature of MONAI)
In a few places (e.g., 3rd cell here), we have:
root_dir = tempfile.mkdtemp if directory is None else directory
which is missing the brackets:
root_dir = tempfile.mkdtemp() if directory is None else directory
Might be worth grepping and replacing all mkdtemp[space] with mkdtemp()[space].
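A quick way to see the difference between the two lines, using only the standard library. The variable `directory` stands in for the value read from the environment in the notebooks.

```python
import os
import shutil
import tempfile

directory = None  # stands in for os.environ.get("MONAI_DATA_DIRECTORY")

# Without the call, the result is the function object rather than a path.
wrong = tempfile.mkdtemp if directory is None else directory
print(callable(wrong))

# With the call, a temporary directory is created and its path is returned.
root_dir = tempfile.mkdtemp() if directory is None else directory
print(os.path.isdir(root_dir))

shutil.rmtree(root_dir)  # clean up the temporary directory
```

Any later `os.path.join(root_dir, ...)` would fail with the first form, because a function object is not a valid path.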
I (and so maybe other new users) would find it really useful if there was a suggested order to some of the tutorials, building in complexity, rather than just alphabetical order.
Hey all,
I am currently using MONAI to participate in the grand challenge for COVID segmentation. As a baseline model I use the DynUNet with parameters adapted from nnU-Net. This works great and gave me a validation Dice of around 0.7. To further improve the results I wanted to focus on handling the noisy ground truth annotations. Since the ground truth annotations from this project are not really clean, I want to implement some form of 'soft labels'. After Gaussian smoothing of the masks, the probability drops below 1 at the borders of the lesions, reflecting the uncertainty of the ground truth annotation.
I tried implementing this with MONAI building blocks, but I got stuck when using the Dice loss, since the one_hot function that is called in there expects binary mask input and doesn't work as expected for probabilistic masks. I have now written my own 'soft_label_dice' that handles probabilistic labels in the case of only 2 class labels. I thought this might be an interesting feature for MONAI, since many segmentation problems have uncertain ground truth boundaries.
I was wondering what you think of this soft labeling strategy. I know other methods exist for increasing noise robustness, but it seemed my model was being punished too hard for making mistakes during training on regions that are only coarsely annotated.
Below I added a snippet with my soft_label_dice function.
Kind regards,
Joris Wuts
def soft_label_dice(preds, label):
    preds = torch.softmax(preds, 1)
    # label is of shape (B1H[WD]) having float values ranging from 0-1
    label = torch.cat((label, (1 - label)), 1)
    reduce_axis = list(range(2, len(preds.shape)))
    nom = torch.sum(torch.pow((preds - label), 2), dim=reduce_axis)
    ground_o = torch.sum(preds, dim=reduce_axis)
    pred_o = torch.sum(label, dim=reduce_axis)
    denominator = ground_o + pred_o + 0.00001
    f: torch.Tensor = nom / denominator
    f = torch.mean(f)
    return f
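Read as a formula, the snippet computes the following, where $p$ is the softmax of the predictions, $y$ is the two-channel soft label built from `label` and `1 - label`, $b$ indexes the batch, $c$ the channel and $x$ the voxels. This is only a transcription of the code as posted, with $\epsilon = 10^{-5}$.

```latex
f = \frac{1}{B\,C} \sum_{b=1}^{B} \sum_{c=1}^{C}
    \frac{\sum_{x} \left( p_{bcx} - y_{bcx} \right)^{2}}
         {\sum_{x} p_{bcx} + \sum_{x} y_{bcx} + \epsilon},
\qquad C = 2
```

Expanding the numerator gives $\sum_x p^2 + \sum_x y^2 - 2\sum_x p\,y$, so the overlap term $\sum_x p\,y$ that appears in the usual Dice score enters with a negative sign and the loss is zero when $p = y$. Note also that the concatenation order makes channel 0 the lesion channel, so the network output would need to follow the same order.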
would be great to extend the https://github.com/Project-MONAI/Tutorials/blob/master/load_medical_images.ipynb to load DICOM
see also Project-MONAI/MONAI#1032
QUESTION 1:
When I apply a list of transforms as in the Spleen tutorial notebook, do they happen once here:
train_ds = CacheDataset(data=train_files, transform=train_trans, cache_rate=1.0, num_workers=8)
Note that it says
Load and cache transformed data: 100%|██████████| 41/41 [00:15<00:00, 2.65it/s]
The past tense "transformed" seems to indicate that transformations only happen once. Or, after defining
train_loader = DataLoader(train_ds, batch_size=2, shuffle=True, num_workers=loader_workers)
do the transformations actually happen on every reference to an item in the training queue, specifically here:
for batch_data in train_loader:
This is the ideal case for me. In the former case, should I repeat my data 100 times before running it through CacheDataset to get my augmentations? Is that standard? It seems it would be a lot better to do the transformations on the fly. Also very necessary for a subsampling transformation like RandCropByPosNegLabeld.
This could be a dumb question, I just don't see it spelled out in the docs and the logging printed out by CacheDataset.
NOTE: I'm guessing this happens with every train_loader yield, because my training loop has slowed way down. This leads to
QUESTION 2: Would it be possible to do these transforms in the GPU? I'm assuming the slowdown happens because they are on CPU, as shown by the attached picture, which depicts a very lightly loaded GPU and 1 hammered CPU core. This leads to
QUESTION 3: Can I speed up the train loader transformations by adding workers? I'm guessing Yes. If not, should be Yes. I'll try it now.
ANSWER 1&3: Yes it must be happening for each train_loader yield, yes adding workers helps. NOTE: A comment in the tutorial notebook says "because this is cached in memory, you only need one work". This is misleading. And on Question 2: The 8 cores I added are 100% active. The GPU is 10% to 25% loaded max. These transforms should happen on the GPU!! Most of the compute time is spent in the transforms. Very little in the training.
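On Question 1, the behaviour described matches how CacheDataset is documented: the deterministic transforms at the start of the pipeline run once and their results are cached, while everything from the first random transform onward runs again on each access. The toy class below imitates that split to make the idea concrete. It is not MONAI's implementation, and the `is_random` attribute and the `ToyCacheDataset`, `scale` and `jitter` names are made up for this sketch.

```python
import random

calls = {"scale": 0, "jitter": 0}  # counts how often each transform runs


def scale(x):
    calls["scale"] += 1
    return x * 10


def jitter(x):
    calls["jitter"] += 1
    return x + random.random()


jitter.is_random = True  # hypothetical marker for a random transform


class ToyCacheDataset:
    """Caches the output of the leading deterministic transforms only."""

    def __init__(self, data, transforms):
        # Split the pipeline at the first transform flagged as random.
        split = next(
            (i for i, t in enumerate(transforms) if getattr(t, "is_random", False)),
            len(transforms),
        )
        self.random_part = transforms[split:]
        self.cache = []
        for item in data:
            for t in transforms[:split]:
                item = t(item)
            self.cache.append(item)  # computed once, up front

    def __len__(self):
        return len(self.cache)

    def __getitem__(self, index):
        item = self.cache[index]
        for t in self.random_part:
            item = t(item)  # re-run on every access
        return item


ds = ToyCacheDataset([1, 2, 3], [scale, jitter])
first, second = ds[0], ds[0]
print(calls)
```

After building the dataset and reading the same item twice, `scale` has run three times (once per item, at construction) and `jitter` twice (once per access). So repeating the data before caching should not be needed to get fresh augmentations, and the per-access cost is the cost of the random part of the pipeline.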
Describe the bug
The example crashes. I tried different roi_sizes, but setting e.g. (-1, -1, -1) just postpones the crash for later in the process.
To Reproduce
Run my notebook which is a modified copy of the spleen example.
Expected behavior
Training finishes after a while.
Environment (please complete the following information):
Windows 10
MONAI version: 0.2.0
Python version: 3.7.9 (tags/v3.7.9:13c94747c7, Aug 17 2020, 18:58:18) [MSC v.1900 64 bit (AMD64)]
Numpy version: 1.19.1
Pytorch version: 1.4.0+cpu
Optional dependencies:
Pytorch Ignite version: 0.3.0
Nibabel version: 3.1.1
scikit-image version: 0.17.2
Pillow version: 7.2.0
Tensorboard version: 2.3.0
Additional context
----------
epoch 1/10
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-12-8f78531d9ef2> in <module>
19 )
20 optimizer.zero_grad()
---> 21 outputs = model(inputs)
22 loss = loss_function(outputs, labels)
23 loss.backward()
c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)
c:\dev\monai\pyenv\lib\site-packages\monai\networks\nets\unet.py in forward(self, x)
125
126 def forward(self, x):
--> 127 x = self.model(x)
128 return x
129
c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)
c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\container.py in forward(self, input)
98 def forward(self, input):
99 for module in self:
--> 100 input = module(input)
101 return input
102
c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)
c:\dev\monai\pyenv\lib\site-packages\monai\networks\layers\simplelayers.py in forward(self, x)
31
32 def forward(self, x):
---> 33 return torch.cat([x, self.submodule(x)], self.cat_dim)
34
35
c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)
c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\container.py in forward(self, input)
98 def forward(self, input):
99 for module in self:
--> 100 input = module(input)
101 return input
102
c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)
c:\dev\monai\pyenv\lib\site-packages\monai\networks\layers\simplelayers.py in forward(self, x)
31
32 def forward(self, x):
---> 33 return torch.cat([x, self.submodule(x)], self.cat_dim)
34
35
c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)
c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\container.py in forward(self, input)
98 def forward(self, input):
99 for module in self:
--> 100 input = module(input)
101 return input
102
c:\dev\monai\pyenv\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)
c:\dev\monai\pyenv\lib\site-packages\monai\networks\layers\simplelayers.py in forward(self, x)
31
32 def forward(self, x):
---> 33 return torch.cat([x, self.submodule(x)], self.cat_dim)
34
35
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 7 and 8 in dimension 3 at C:\w\1\s\windows\pytorch\aten\src\TH/generic/THTensor.cpp:612
Please can anybody tell me how to use the TensorBoardImageHandler for the challenge_baseline script?
What is the output_transform to use?
Is your feature request related to a problem? Please describe.
(originally from Project-MONAI/MONAI#498 )We can use MONAI to build many FL examples based on different FL architectures, this issue is to track the development of an example based on NVIDIA Clara FL.
Describe the bug
When running the current version of the autoencoder_mednist tutorial, it crashes while trying to perform transformations on the data, specifically while creating the CacheDataset.
To Reproduce
Steps to reproduce the behavior:
Simply run all the cells until you reach the creation of the CacheDataset - that's where it crashes
Expected behavior
It should perform the transformations
Additional context
A simple solution I found is to add the "reader" parameter to the LoadImageD transformation. In the case of the MedNIST Hand dataset (which is the default in this tutorial) it should be reader="PILReader", as all the images are .jpg
Hi,
I am loading a state dict into the same model generated by monai.networks.nets.UNet. However, it reports the error below. I have successfully loaded this state dict before, so I am not sure what is wrong this time. It seems that the weight names differ between 'act' and 'adn.A'.
Thank you.
RuntimeError: Error(s) in loading state_dict for UNet:
Missing key(s) in state_dict: "model.0.conv.unit0.act.weight", "model.0.conv.unit1.act.weight", "model.1.submodule.0.conv.unit0.act.weight", "model.1.submodule.0.conv.unit1.act.weight", "model.1.submodule.1.submodule.0.conv.unit0.act.weight", "model.1.submodule.1.submodule.0.conv.unit1.act.weight", "model.1.submodule.1.submodule.1.submodule.0.conv.unit0.act.weight", "model.1.submodule.1.submodule.1.submodule.0.conv.unit1.act.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.0.conv.unit0.act.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.0.conv.unit1.act.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.1.submodule.conv.unit0.act.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.1.submodule.conv.unit1.act.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.2.0.act.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.2.1.conv.unit0.act.weight", "model.1.submodule.1.submodule.1.submodule.2.0.act.weight", "model.1.submodule.1.submodule.1.submodule.2.1.conv.unit0.act.weight", "model.1.submodule.1.submodule.2.0.act.weight", "model.1.submodule.1.submodule.2.1.conv.unit0.act.weight", "model.1.submodule.2.0.act.weight", "model.1.submodule.2.1.conv.unit0.act.weight", "model.2.0.act.weight".
Unexpected key(s) in state_dict: "model.0.conv.unit0.adn.A.weight", "model.0.conv.unit1.adn.A.weight", "model.1.submodule.0.conv.unit0.adn.A.weight", "model.1.submodule.0.conv.unit1.adn.A.weight", "model.1.submodule.1.submodule.0.conv.unit0.adn.A.weight", "model.1.submodule.1.submodule.0.conv.unit1.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.0.conv.unit0.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.0.conv.unit1.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.0.conv.unit0.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.0.conv.unit1.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.1.submodule.conv.unit0.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.1.submodule.conv.unit1.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.2.0.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.1.submodule.2.1.conv.unit0.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.2.0.adn.A.weight", "model.1.submodule.1.submodule.1.submodule.2.1.conv.unit0.adn.A.weight", "model.1.submodule.1.submodule.2.0.adn.A.weight", "model.1.submodule.1.submodule.2.1.conv.unit0.adn.A.weight", "model.1.submodule.2.0.adn.A.weight", "model.1.submodule.2.1.conv.unit0.adn.A.weight", "model.2.0.adn.A.weight".
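The two key lists differ only in the name of the activation sub-module ("act" expected by the model, "adn.A" present in the checkpoint), which suggests the checkpoint was saved and loaded under different MONAI versions. Loading with the same version that saved the checkpoint is the safest route. If that is not possible, renaming the keys may work, assuming the layers differ in name only. The sketch below shows the renaming on a plain dictionary; `rename_keys` is a hypothetical helper and the values are placeholders rather than real tensors.

```python
def rename_keys(state_dict, old=".adn.A.", new=".act."):
    """Return a copy of state_dict with `old` replaced by `new` in every key."""
    return {key.replace(old, new): value for key, value in state_dict.items()}


checkpoint = {
    "model.0.conv.unit0.adn.A.weight": 0.25,  # placeholder value, not a real tensor
    "model.0.conv.unit0.conv.weight": 1.0,    # keys without the pattern are kept as-is
}
converted = rename_keys(checkpoint)
print(sorted(converted))
```

With a real checkpoint, the converted dictionary would then be passed to `model.load_state_dict(...)`. Only activation weights appear in the error message; if other layers (for example normalisation layers with parameters) were also renamed, they would need their own mapping.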
Command used for sliding window inference (fails on MONAI 0.4.0, but works fine on 0.3.0)
for val_data in self.val_loader:
    val_step_start_time = time.time()
    val_images, val_labels = val_data["image"].to(self.device), val_data["label"].to(self.device)
    roi_size = (128, 128, 128)
    sw_batch_size = 6
    if amp:
        with torch.cuda.amp.autocast():
            val_outputs = sliding_window_inference(val_images, roi_size, sw_batch_size, self.network, sw_device=self.device, device=self.device)
    else:
        val_outputs = sliding_window_inference(val_images, roi_size, sw_batch_size, self.network)
I get the same error with and without AMP:
AttributeError Traceback (most recent call last)
/data/archit/Liver/Experiments/monai/main.py in <module>
25
26 if __name__ == "__main__":
---> 27 main()
/data/archit/Liver/Experiments/monai/main.py in main()
19 if args.continue_training == True:
20 trainer.load_best_checkpoint()
---> 21 trainer.trainProcess(amp=True)
22
23
/data/archit/Liver/Experiments/monai/TeraReconAI/train/segmentationTrainer.py in trainProcess(self, amp)
71 self.initialize_network()
72 amp_start = time.time()
---> 73 super()._trainProcess(amp)
74 amp_total_time = time.time() - amp_start
75 print(f"Total training time with AMP: {amp_total_time:.4f}")
/data/archit/Liver/Experiments/monai/TeraReconAI/train/trainer.py in _trainProcess(self, amp)
168 # else:
169 self.network = self.network.to(self.device)
--> 170 val_outputs = sliding_window_inference(val_images, roi_size, sw_batch_size, self.network, sw_device=self.device, device=self.device)
171
172
/data/archit/Software/anaconda3/envs/monai/lib/python3.8/site-packages/monai/inferers/utils.py in sliding_window_inference(inputs, roi_size, sw_batch_size, predictor, overlap, mode, sigma_scale, padding_mode, cval, sw_device, device, *args, **kwargs)
127 ]
128 window_data = torch.cat([inputs[win_slice] for win_slice in unravel_slice]).to(sw_device)
--> 129 seg_prob = predictor(window_data, *args, **kwargs).to(device) # batched patch segmentation
130
131 if not _initialized: # init. buffer at the first iteration
AttributeError: 'list' object has no attribute 'to'
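The traceback shows sliding_window_inference calling .to(device) on whatever the predictor returns, so the error suggests self.network returns a list rather than a single tensor. A possible workaround is to wrap the network so the inferer receives one output. This sketch assumes the first list element is the main prediction, which the post does not confirm; first_output is a hypothetical helper, and the lambdas stand in for a real network.

```python
def first_output(predictor):
    """Wrap a callable so list/tuple results collapse to their first element.

    Assumption: when the network returns several outputs, the first one
    is the main prediction. Adjust the index if that is not the case.
    """
    def wrapped(*args, **kwargs):
        out = predictor(*args, **kwargs)
        return out[0] if isinstance(out, (list, tuple)) else out
    return wrapped


# Stand-ins for a network: one returns a list, the other a single value.
list_net = lambda x: [x + 1, x + 2]
plain_net = lambda x: x * 2

print(first_output(list_net)(1))
print(first_output(plain_net)(3))
```

In the loop above, the wrapped network would be passed in place of self.network, for example first_output(self.network) as the predictor argument.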
I tried running the command python run_net.py train --data_folder "COVID-19-20_v2/Train" --model_folder "runs"
per the instructions and got the output shown in example A below. The model does not seem to be training. I also tried running the inference command python run_net.py infer --data_folder "COVID-19-20_v2/Validation" --model_folder "runs"
and got the error shown in example B.
When I check the runs folder, I see no indication that the model ran or that any checkpoints were saved.
I am using Google Colab to train the model.
example A
MONAI version: 0.3.0+57.g70650b8
Python version: 3.6.9 (default, Oct 8 2020, 12:12:24) [GCC 8.4.0]
OS version: Linux (4.19.112+)
Numpy version: 1.18.5
Pytorch version: 1.7.0+cu101
MONAI flags: HAS_EXT = False, USE_COMPILED = False
Optional dependencies:
Pytorch Ignite version: 0.4.2
Nibabel version: 3.0.2
scikit-image version: 0.16.2
Pillow version: 7.0.0
Tensorboard version: 2.3.0
gdown version: 3.6.4
TorchVision version: 0.8.1+cu101
ITK version: NOT INSTALLED or UNKNOWN VERSION.
tqdm version: 4.51.0
For details about installing the optional dependencies, please visit:
https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies
INFO:root:training: image/label (199) folder: COVID-19-20_v2/Train
INFO:root:training: train 160 val 39, folder: COVID-19-20_v2/Train
INFO:root:batch size 2
Load and cache transformed data: 100% 160/160 [05:18<00:00, 1.99s/it]
Load and cache transformed data: 100% 39/39 [01:21<00:00, 2.10s/it]
BasicUNet features: (32, 32, 64, 128, 256, 32).
^C
example B
MONAI version: 0.3.0+57.g70650b8
Python version: 3.6.9 (default, Oct 8 2020, 12:12:24) [GCC 8.4.0]
OS version: Linux (4.19.112+)
Numpy version: 1.18.5
Pytorch version: 1.7.0+cu101
MONAI flags: HAS_EXT = False, USE_COMPILED = False
Optional dependencies:
Pytorch Ignite version: 0.4.2
Nibabel version: 3.0.2
scikit-image version: 0.16.2
Pillow version: 7.0.0
Tensorboard version: 2.3.0
gdown version: 3.6.4
TorchVision version: 0.8.1+cu101
ITK version: NOT INSTALLED or UNKNOWN VERSION.
tqdm version: 4.51.0
For details about installing the optional dependencies, please visit:
https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies
Traceback (most recent call last):
File "run_net.py", line 264, in <module>
infer(data_folder=data_folder, model_folder=args.model_folder)
File "run_net.py", line 179, in infer
ckpt = ckpts[-1]
IndexError: list index out of range
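The IndexError in example B comes from ckpts[-1] being evaluated on an empty list, which is consistent with no checkpoint having been saved in the model folder. A guarded lookup would make that failure easier to read. This is a minimal sketch, not the code in run_net.py; the "*.pt" pattern and the helper name latest_checkpoint are assumptions.

```python
import glob
import os
import tempfile


def latest_checkpoint(model_folder, pattern="*.pt"):
    """Return the last matching path in sorted order, or raise a clear error.

    The "*.pt" pattern is an assumption about how checkpoints are named.
    """
    ckpts = sorted(glob.glob(os.path.join(model_folder, pattern)))
    if not ckpts:
        raise FileNotFoundError(
            f"No files matching {pattern!r} in {model_folder!r}; "
            "run training until a checkpoint is saved before inference."
        )
    return ckpts[-1]


# An empty folder reproduces the situation from example B.
with tempfile.TemporaryDirectory() as folder:
    try:
        latest_checkpoint(folder)
    except FileNotFoundError as err:
        print(err)
```

With this check in place, running inference before training has produced a checkpoint reports the empty folder directly instead of raising an IndexError.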