
WORC's Introduction


WORC v3.6.3

Workflow for Optimal Radiomics Classification


Introduction

WORC is an open-source Python package for the easy execution and fully automatic construction and optimization of radiomics workflows. For more details, we refer to the WORC paper: https://doi.org/10.48550/arXiv.2108.08618.

Overview

We aim to establish a general radiomics platform that supports easy integration of other tools. With our modular design and support for different software languages (Python, MATLAB, Ruby, Java, etc.), we want to facilitate and stimulate collaboration, standardisation and comparison of different radiomics approaches. By combining these in a single framework, we hope to find a universal radiomics strategy that can address various problems.

License

This package is covered by the open-source Apache 2.0 License.

When using WORC, please cite both this repository and the paper describing WORC as follows:

@article{starmans2021reproducible,
   title          = {Reproducible radiomics through automated machine learning validated on twelve clinical applications}, 
   author         = {Martijn P. A. Starmans and Sebastian R. van der Voort and Thomas Phil and Milea J. M. Timbergen and Melissa Vos and Guillaume A. Padmos and Wouter Kessels and David    Hanff and Dirk J. Grunhagen and Cornelis Verhoef and Stefan Sleijfer and Martin J. van den Bent and Marion Smits and Roy S. Dwarkasing and Christopher J. Els and Federico Fiduzi and Geert J. L. H. van Leenders and Anela Blazevic and Johannes Hofland and Tessa Brabander and Renza A. H. van Gils and Gaston J. H. Franssen and Richard A. Feelders and Wouter W. de Herder and Florian E. Buisman and Francois E. J. A. Willemssen and Bas Groot Koerkamp and Lindsay Angus and Astrid A. M. van der Veldt and Ana Rajicic and Arlette E. Odink and Mitchell Deen and Jose M. Castillo T. and Jifke Veenland and Ivo Schoots and Michel Renckens and Michail Doukas and Rob A. de Man and Jan N. M. IJzermans and Razvan L. Miclea and Peter B. Vermeulen and Esther E. Bron and Maarten G. Thomeer and Jacob J. Visser and Wiro J. Niessen and Stefan Klein},
   year           = {2021},
   eprint         = {2108.08618},
   archivePrefix  = {arXiv},
   primaryClass   = {eess.IV}
}

@software{starmans2018worc,
  author       = {Martijn P. A. Starmans and Thomas Phil and Sebastian R. van der Voort and Stefan Klein},
  title        = {Workflow for Optimal Radiomics Classification (WORC)},
  year         = {2018},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.3840534},
  url          = {https://github.com/MStarmans91/WORC}
}

The DOI can be found in the Zenodo reference above.

Disclaimer

This package is still under development. We try to thoroughly test and evaluate every new build and function, but bugs can of course still occur. If you find any, please contact us through the channels below or create an issue on this GitHub repository, and we will try to fix them as soon as possible.

Tutorial, documentation and dataset

The WORC tutorial is hosted at https://github.com/MStarmans91/WORCTutorial.

The official documentation can be found at https://worc.readthedocs.io.

I developed WORC during my PhD; you can find the thesis built around it here: http://hdl.handle.net/1765/137089.

The publicly released WORC database is described in the following paper:

@article {Starmans2021WORCDatabase,
	author = {Starmans, Martijn P.A. and Timbergen, Milea J.M. and Vos, Melissa and Padmos, Guillaume A. and Gr{\"u}nhagen, Dirk J. and Verhoef, Cornelis and Sleijfer, Stefan and van Leenders, Geert J.L.H. and Buisman, Florian E. and Willemssen, Francois E.J.A. and Koerkamp, Bas Groot and Angus, Lindsay and van der Veldt, Astrid A.M. and Rajicic, Ana and Odink, Arlette E. and Renckens, Michel and Doukas, Michail and de Man, Rob A. and IJzermans, Jan N.M. and Miclea, Razvan L. and Vermeulen, Peter B. and Thomeer, Maarten G. and Visser, Jacob J. and Niessen, Wiro J. and Klein, Stefan},
	title = {The WORC database: MRI and CT scans, segmentations, and clinical labels for 930 patients from six radiomics studies},
	elocation-id = {2021.08.19.21262238},
	year = {2021},
	doi = {10.1101/2021.08.19.21262238},
	URL = {https://www.medrxiv.org/content/early/2021/08/25/2021.08.19.21262238},
	eprint = {https://www.medrxiv.org/content/early/2021/08/25/2021.08.19.21262238.full.pdf},
	journal = {medRxiv}
}

The code to download the WORC database and reproduce our experiments can be found at https://github.com/MStarmans91/WORCDatabase.

If you run into any issues, feel free to open an issue on this GitHub repository. We advise you to read the FAQ first though: https://worc.readthedocs.io/en/latest/static/faq.html.

Installation

NOTE: Yes, by default we run a very old version of Python and some old package versions (e.g., sklearn 0.23) due to hardware constraints. However, we have already prepared a release that runs under Python 3.11 and more recent packages, which you can alternatively use: https://github.com/MStarmans91/WORC/tree/newpython. While not all features are fully tested, the default experimental setups work.

WORC supports Unix and Windows systems with Python 3.6 and 3.7: the unit tests are performed on the latest Ubuntu and Windows versions with Python 3.7. For detailed installation instructions, please check the ReadTheDocs installation guidelines.

The package can be installed through pip:

  pip install WORC

Alternatively, you can directly install WORC from this repository:

  python setup.py install

Make sure you install the requirements first:

  pip install -r requirements.txt

3rd-party packages used in WORC:

For the other Python packages used, see the requirements file.

Start

We suggest you start with the WORC Tutorial. Besides a Jupyter notebook with instructions, it also provides an example script to get you started.

Contact

We are happy to help you with any questions. Please send us an email or open an issue on GitHub.

We welcome contributions to WORC. For the moment, converting your toolbox into a fastr tool is sufficient: see also the fastr tool development documentation.

Optional extra features

Besides the default installation, there are several optional packages you could install to support WORC.

Graphviz

WORC can draw the network and save it as an SVG image using graphviz. In order to do so, please make sure you install graphviz. On Ubuntu, simply run

  apt install graphviz

On Windows, follow the installation instructions provided on the graphviz website. Make sure you add the executable to the PATH when prompted.
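Whether on Ubuntu or Windows, you can verify that the graphviz `dot` executable is actually reachable before running WORC; a minimal sketch (the helper name is ours, not part of WORC):

```python
import shutil

def graphviz_available():
    """Return True if the Graphviz 'dot' executable is found on PATH."""
    return shutil.which("dot") is not None

if not graphviz_available():
    print("graphviz 'dot' not found on PATH; WORC will not be able to draw the network")
```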

Elastix

Image registration is included in WORC through elastix and transformix. In order to use elastix, please download the binaries and place them in your fastr.config.mounts['apps'] path. Check the elastix tool description for the correct subdirectory structure. For example, on Linux, the binaries and libraries should be in "../apps/elastix/4.8/install/" and "../apps/elastix/4.8/install/lib" respectively.
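Based on the example paths above, the layout under the apps mount would look roughly like this (a sketch; the exact binary names and version directory depend on your elastix release):

```
apps/
└── elastix/
    └── 4.8/
        └── install/
            ├── elastix        # registration binary
            ├── transformix    # transformation binary
            └── lib/           # shared libraries
```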

Note: optionally, you can tell WORC to copy the metadata from the image file to the segmentation file before applying the deformation field. This requires ITK and ITKTools: see the ITKTools github for installation instructions.

XNAT

We use the XNATpy package to connect the toolbox to the XNAT online database platforms. You will only need this when you use the example dataset we provided, or if you want to download or upload data from or to XNAT. We advise you to specify your account settings in a .netrc file when using this feature for your own datasets, such that you do not need to input them on every request.
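For reference, a .netrc entry for an XNAT host would look like this (hypothetical host and credentials; the file lives at ~/.netrc and should be readable only by you, e.g. chmod 600):

```
machine xnat.example.com
login myusername
password mypassword
```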

WORC's People

Contributors

lyhyl, mitchelldeen, mstarmans91, sikerdebaard, svdvoort


WORC's Issues

[BUG] encounter divide by zero error when using multilabel tag

Describe the bug
At the performance/plot_Estimator step, it reports
"stats.py:7028: RuntimeWarning: divide by zero encountered in double_scalars\n z = (bigu - meanrank) / sd";
the "sd" is always zero.

I found that this is due to StatisticalTestThreshold.py:86. In that line, "class1" would always be an empty list.
I updated the code at lines 85-87 and it then works successfully:

# before
  for i in range(n_label):
      class1 = [i for j, i in enumerate(fv) if np.argmax(Y_train[j]) == n_label]
      class2 = [i for j, i in enumerate(fv) if np.argmax(Y_train[j]) != n_label]

# after
  for i_label in range(n_label):
      class1 = [i for j, i in enumerate(fv) if np.argmax(Y_train[j]) == i_label]
      class2 = [i for j, i in enumerate(fv) if np.argmax(Y_train[j]) != i_label]
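The empty class1 can be reproduced standalone with dummy data (not WORC itself): np.argmax over one-hot labels always returns a value below n_label, so comparing it to n_label, rather than the current label index, can never match.

```python
import numpy as np

n_label = 4
Y_train = np.eye(n_label)[[1, 0, 0, 2, 3]]  # one-hot labels for five patients
fv = [10.0, 20.0, 30.0, 40.0, 50.0]         # one feature value per patient

# Buggy comparison: argmax is always < n_label, so this is never true.
class1_bug = [v for j, v in enumerate(fv) if np.argmax(Y_train[j]) == n_label]
assert class1_bug == []

# Fixed comparison, as in the patch above, shown for the first label:
i_label = 0
class1 = [v for j, v in enumerate(fv) if np.argmax(Y_train[j]) == i_label]
class2 = [v for j, v in enumerate(fv) if np.argmax(Y_train[j]) != i_label]
```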

WORC configuration
modus = 'multiclass_classification'

Label example

Patient Label_1 Label_2 Label_3 Label_4
id_1 0 1 0 0
id_2 1 0 0 0
id_3 1 0 0 0
id_4 0 0 1 0
id_5 0 0 0 1

Logging error

Traceback (most recent call last):
File "/home/z/anaconda3/envs/worc/lib/python3.9/logging/__init__.py", line 434, in format
return self._format(record)
File "/home/z/anaconda3/envs/worc/lib/python3.9/logging/__init__.py", line 430, in _format
return self._fmt % record.__dict__
KeyError: 'tracking_id'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/z/anaconda3/envs/worc/lib/python3.9/logging/__init__.py", line 1083, in emit
msg = self.format(record)
File "/home/z/anaconda3/envs/worc/lib/python3.9/logging/__init__.py", line 927, in format
return fmt.format(record)
File "/home/z/anaconda3/envs/worc/lib/python3.9/logging/__init__.py", line 666, in format
s = self.formatMessage(record)
File "/home/z/anaconda3/envs/worc/lib/python3.9/logging/__init__.py", line 635, in formatMessage
return self._style.format(record)
File "/home/z/anaconda3/envs/worc/lib/python3.9/logging/__init__.py", line 436, in format
raise ValueError('Formatting field not found in record: %s' % e)
ValueError: Formatting field not found in record: 'tracking_id'
Call stack:
File "/home/z/anaconda3/envs/worc/lib/python3.9/site-packages/WORC-3.6.0-py3.9.egg/WORC/resources/fastr_tools/worc/bin/PlotEstimator.py", line 89, in <module>
main()
File "/home/z/anaconda3/envs/worc/lib/python3.9/site-packages/WORC-3.6.0-py3.9.egg/WORC/resources/fastr_tools/worc/bin/PlotEstimator.py", line 77, in main
plot_estimator_performance(prediction=args.prediction,
File "/home/z/anaconda3/envs/worc/lib/python3.9/site-packages/WORC-3.6.0-py3.9.egg/WORC/plotting/plot_estimator_performance.py", line 438, in plot_estimator_performance
fitted_model.create_ensemble(X_train_temp, Y_train_temp,
File "/home/z/anaconda3/envs/worc/lib/python3.9/site-packages/WORC-3.6.0-py3.9.egg/WORC/classification/SearchCV.py", line 1428, in create_ensemble
base_estimator.refit_and_score(X_train, Y_train, p_all,
File "/home/z/anaconda3/envs/worc/lib/python3.9/site-packages/WORC-3.6.0-py3.9.egg/WORC/classification/SearchCV.py", line 912, in refit_and_score
out = fit_and_score(X_fit, y, self.scoring,
File "/home/z/anaconda3/envs/worc/lib/python3.9/site-packages/WORC-3.6.0-py3.9.egg/WORC/classification/fitandscore.py", line 770, in fit_and_score
StatisticalSel.fit(X_train, y)
File "/home/z/anaconda3/envs/worc/lib/python3.9/site-packages/WORC-3.6.0-py3.9.egg/WORC/featureprocessing/StatisticalTestThreshold.py", line 90, in fit
metric_value_temp = self.metric_function(class1, class2, **self.parameters)[1]
File "/home/z/anaconda3/envs/worc/lib/python3.9/site-packages/scipy/stats/stats.py", line 7028, in mannwhitneyu
z = (bigu - meanrank) / sd
File "/home/z/anaconda3/envs/worc/lib/python3.9/warnings.py", line 109, in _showwarnmsg
sw(msg.message, msg.category, msg.filename, msg.lineno,
Message: '%s'
Arguments: ('/home/z/anaconda3/envs/worc/lib/python3.9/site-packages/scipy/stats/stats.py:7028: RuntimeWarning: divide by zero encountered in double_scalars\n z = (bigu - meanrank) / sd\n',)

Desktop (please complete the following information):

  • OS: Ubuntu 22.04.1 LTS
  • Python version: 3.9.12
  • WORC Version: 3.6.0

A dot ('.') is added to a process yaml filepath

Describe the bug
When running the BasicWORC tutorial, it runs without any issue using the 'predict/CalcFeatures:1.0' configuration for FeatureCalculators. However, when using the 'pyradiomics/Pyradiomics:1.0', the features are not calculated. I ran a fastr trace like this:
fastr trace $RUNDIR/WORC_Example_STWStrategyHN_BasicWORC/tmp/__sink_data__.json --sinks features_train_CT_0_pyradiomics --sample HN1006_0 -v
The output is this:
fastr.exceptions.FastrFileNotFound: FastrFileNotFound from $PYTHONENVPATH/envs/worc/lib/python3.7/site-packages/fastr/abc/serializable.py line 152: Could not find file Could not open $RUNDIR/WORC_Example_STWStrategyHN_BasicWORC/tmp/.calcfeatures_train_pyradiomics_Pyradiomics_1_0_CT_0/HN1006_0/__fastr_result__.yaml for reading

In the tmp folder, the path exists under name "calcfeatures_train_pyradiomics_Pyradiomics_1_0_CT_0". But for some reason fastr is looking for a hidden folder ".calcfeatures_train_pyradiomics_Pyradiomics_1_0_CT_0" that doesn't exist.

WORC configuration
Default config with overrides on "FeatureCalculators" to use pyradiomics.
In WORC_config.py, the mount folders for tmp and output were changed as well to be inside the $RUNDIR/experimentfolder

Desktop (please complete the following information):

  • OS: [Ubuntu 22.04.3 LTS]
  • Python version: [3.7.4]
  • WORC Version [3.6.3]

Update
It seems the problem is not only related to pyradiomics. For some reason, sometimes fastr keeps looking for a hidden folder. For example, for all the failed sinks (Barchart_PNG, classification, StatisticalTestFeatures_CSV, etc) fastr is looking for a hidden folder .classify instead of the existing folder classify:
fastr trace $RUNDIR/tmp/__sink_data__.json --sinks Barchart_PNG -v --sample all
[WARNING] __init__:0084 >> Not running in a production installation (branch "unknown" from installed package)
Tracing errors for sample all from sink Barchart_PNG
Located result pickle: $RUNDIR/tmp/.classify/all/__fastr_result__.yaml
Traceback (most recent call last):
  File "$HOME/miniconda3/envs/worc/lib/python3.7/site-packages/fastr/abc/serializable.py", line 142, in loadf
    with open_func(path, 'rb') as fin:
FileNotFoundError: [Errno 2] No such file or directory: '$RUNDIR/tmp/.classify/all/__fastr_result__.yaml'

Windows Installation

(PREDICT 2.1.3 and WORC 2.1.3)
Make sure you install PREDICT and WORC in an empty virtual environment (i.e. no site-packages pre-installed). After installing WORC on a Windows system, use the following fixes:

Find the WORC.py file at (venv)/Lib/site-packages/WORC/ . At line 900 change
self.fastrconfigs.append(os.path.join("vfs://tmp/", self.name, ("config_{}_{}.ini").format(self.name, num)))
into
self.fastrconfigs.append(os.path.join(cfile))

Furthermore, find contour_function.py at (venv)/Lib/site-packages/PREDICT/helpers/ .
At line 52 insert this line just after the for loop starts:
boundary_index = boundary_index.astype(np.double)

If you try the tutorial, make sure you use this format to create the sources:

  import glob, os
  image_sources = glob.glob('C:/Data/STWStrategyMMD/*/image.nii.gz')
  image_sources = [i.replace('C:', 'C:') for i in image_sources]
  images = {os.path.basename(os.path.dirname(i)): i for i in image_sources}

Problems running tutorial

I have been trying to run the WORC Simple tutorial for some time but I keep getting the same error:

Traceback (most recent call last):
  File "C:\Program Files\Python3.7.9\lib\site-packages\WORC\validators\preflightcheck.py", line 123, in _validate
    label_data = load_labels(labels_file_train)
  File "C:\Program Files\Python3.7.9\lib\site-packages\WORC\processing\label_processing.py", line 49, in load_labels
    label_file)
  File "C:\Program Files\Python3.7.9\lib\site-packages\WORC\processing\label_processing.py", line 131, in load_label_csv
    raise ae.WORCAssertionError('First column should be patient ID!')
WORC.addexceptions.WORCAssertionError: First column should be patient ID!

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "H:/PycharmProjects/RadiomicsPipeline/WORCTutorialSimple.py", line 199, in <module>
    main()
  File "H:/PycharmProjects/RadiomicsPipeline/WORCTutorialSimple.py", line 120, in main
    experiment.execute()
  File "C:\Program Files\Python3.7.9\lib\site-packages\WORC\facade\simpleworc.py", line 68, in dec
    raise e
  File "C:\Program Files\Python3.7.9\lib\site-packages\WORC\facade\simpleworc.py", line 64, in dec
    func(*args, **kwargs)
  File "C:\Program Files\Python3.7.9\lib\site-packages\WORC\facade\simpleworc.py", line 655, in execute
    self._validate()
  File "C:\Program Files\Python3.7.9\lib\site-packages\WORC\facade\simpleworc.py", line 68, in dec
    raise e
  File "C:\Program Files\Python3.7.9\lib\site-packages\WORC\facade\simpleworc.py", line 64, in dec
    func(*args, **kwargs)
  File "C:\Program Files\Python3.7.9\lib\site-packages\WORC\facade\simpleworc.py", line 478, in _validate
    validator.do_validation(self)
  File "C:\Program Files\Python3.7.9\lib\site-packages\WORC\validators\preflightcheck.py", line 33, in do_validation
    result = self._validate(*args, **kwargs)
  File "C:\Program Files\Python3.7.9\lib\site-packages\WORC\validators\preflightcheck.py", line 127, in _validate
    raise ae.WORCValueError(f'First column in the file given to SimpleWORC().labels_from_this_file(**) needs to be named Patient.')
WORC.addexceptions.WORCValueError: First column in the file given to SimpleWORC().labels_from_this_file(**) needs to be named Patient.

I have made a file with the path H:\PycharmProjects\Data\Examplefiles\pinfo_HN but this also doesn't seem to work (see attached file). How can I fix this?
pinfo_HN.csv
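Based on the error messages above, the first column of the label file must be named Patient and hold the patient IDs; a minimal CSV would look like this (hypothetical IDs and label name):

```
Patient,Label
HN1004,0
HN1006,1
```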

Using multiple ROIs per patient

I'm trying to run WORC with multiple ROIs per patient; the number of ROIs per patient ranges from 1 to 30. When I use the same CT data with only one ROI, the experiment works, but with multiple ROIs it does not. It doesn't create an output file, but also does not stop the run: it more or less freezes and is unable to create the config file and feature files.

The error occurred during execution of the experiment; this is what it says:
Encountered exception (IndexError) during callback fastr.execution.networkrun.job_finished:
Traceback (most recent call last):
File "/nfs/home/clbente/anaconda3/envs/WORC2/lib/python3.7/site-packages/fastr/plugins/executionplugin.py", line 613, in _job_finished_body
self._finished_callback(result)
File "/nfs/home/clbente/anaconda3/envs/WORC2/lib/python3.7/site-packages/fastr/execution/networkrun.py", line 800, in job_finished
node.inputs[input_id][input_argument_element.index].failed_annotations))
File "/nfs/home/clbente/anaconda3/envs/WORC2/lib/python3.7/site-packages/fastr/execution/inputoutputrun.py", line 147, in __getitem__
value = sub[key]
File "/nfs/home/clbente/anaconda3/envs/WORC2/lib/python3.7/site-packages/fastr/execution/inputoutputrun.py", line 606, in __getitem__
return self.source[0][key]
File "/nfs/home/clbente/anaconda3/envs/WORC2/lib/python3.7/site-packages/fastr/execution/linkrun.py", line 164, in __getitem__
source_sample_data = self.source[sourceindex]
File "/nfs/home/clbente/anaconda3/envs/WORC2/lib/python3.7/site-packages/fastr/execution/inputoutputrun.py", line 1420, in __getitem__
return self.linearized[item[0]]
IndexError: tuple index out of range

I'm using a BasicWORC object in which the image and segmentation file paths are given using dictionaries, as described in the user manual:

  images1 = {'patient1_0': '/data/Patient1/image_MR.nii.gz',
             'patient1_1': '/data/Patient1/image_MR.nii.gz'}
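For multiple ROIs per patient, the key pattern from the user manual extends to one entry per ROI; a sketch with hypothetical paths, where the image and segmentation dictionaries must share the same keys:

```python
# Hypothetical example: two ROIs for patient1, one for patient2.
# The same image is listed once per ROI; only the segmentation differs.
images = {
    'patient1_0': '/data/Patient1/image_MR.nii.gz',
    'patient1_1': '/data/Patient1/image_MR.nii.gz',
    'patient2_0': '/data/Patient2/image_MR.nii.gz',
}
segmentations = {
    'patient1_0': '/data/Patient1/seg_ROI1_MR.nii.gz',
    'patient1_1': '/data/Patient1/seg_ROI2_MR.nii.gz',
    'patient2_0': '/data/Patient2/seg_MR.nii.gz',
}
# Keys must match one-to-one between the two dictionaries.
assert images.keys() == segmentations.keys()
```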

  • OS: Linux
  • Python Version: 3.7
  • WORC Version 3.6.2

[BUG] No fitting ensemble found in function create_ensemble

Describe the bug
I used 4 clinical parameters as semantic features in WORC: age, gender, pain and motor symptoms. Except for age, the features are categorical, e.g. 0, 1, 2.

The workflow managed to create the hdf5 file in classification, but it failed to find a fitted ensemble.
In the temporary files, tmp/{exp_name}/classify/all/classification_0.hdf5 exists.
I tried to use the hdf5 file to build the ensemble again for further analysis, but running create_ensemble failed.

An index-out-of-range error occurs in WORC/classification/SearchCV.py, lines 1491 and 1492.
If no fitted ensemble is found during the whole search, enum adds up to 100 and causes the index-out-of-range error in line 1492.

Now the problem is that even if I fix the index issue, there is still no fitted ensemble.
Can I conclude that, for the four features I used, there is no fitting model among the models used in WORC?

WORC configuration
In the config file, only sematic features are used in the feature selection.

Expected behavior
Report that the fitting fails for the chosen features.

Desktop (please complete the following information):

  • OS: Mac OS
  • Python version: 3.6.8
  • WORC Version 3.6.8

Additional context
No

ValueError: No performance file performance_all_0.json found: your network has failed.

Desktop (please complete the following information):

  • OS: Ubuntu
  • Python version: 3.7
  • WORC version : 3.6.3
  • pyradiomics version: 2.2.0

When I run the code, I receive the following error:

ValueError: No performance file performance_all_0.json found: your network has failed.

I have also noticed that the following jobs have failed or been cancelled:

-1-: calcfeatures_train_pyradiomics_Pyradiomics_1_0_MRI_0___MDD_0012_0 with status JobState.failed

-2-: featureconverter_train_pyradiomics_Pyradiomics_1_0_MRI_0___MDD_0012_0 with status JobState.cancelled

I am not sure what is causing these errors and would be grateful for any assistance you could provide. I have checked the documentation but have not been able to find a solution.

Would you be able to suggest any steps I can take to resolve this issue? I would be happy to provide any additional information that may be helpful.

Thank you for your time and consideration.

[BUG] missing Key: extraction_mode in WORC config file

Describe the bug
WORC seems incompatible with PREDICTFastr 3.1.14+ release, due to missing key: extraction_mode introduced in PREDICTFastr 3.1.14.

WORC configuration
SimpleWORC config file, here shown only the ImageFeatures:

[ImageFeatures]
shape = True
histogram = True
orientation = True
texture_Gabor = True
texture_LBP = True
texture_GLCM = True
texture_GLCMMS = True
texture_GLRLM = False
texture_GLSZM = False
texture_NGTDM = False
coliage = False
vessel = True
log = True
phase = True
image_type = CT
gabor_frequencies = 0.05, 0.2, 0.5
gabor_angles = 0, 45, 90, 135
GLCM_angles = 0, 0.79, 1.57, 2.36
GLCM_levels = 16
GLCM_distances = 1, 3
LBP_radius = 3, 8, 15
LBP_npoints = 12, 24, 36
phase_minwavelength = 3
phase_nscale = 5
log_sigma = 1, 5, 10
vessel_scale_range = 1, 10
vessel_scale_step = 2
vessel_radius = 5
dicom_feature_tags = 0010 1010, 0010 0040
dicom_feature_labels = age, sex

fastr trace

[WARNING]  __init__:0084 >> Not running in a production installation (branch "develop" from source code)
Tracing errors for sample Lipo-100 from sink features_train_CT_0_predict
Located result pickle: /scratch/dspaanderman/tmp/WORC_LIPO_classification_minus/featureconverter_train_predict_CalcFeatures_1_0_CT_0/Lipo-100/__fastr_result__.yaml

===== JOB WORC_LIPO_classification_minus___featureconverter_train_predict_CalcFeatures_1_0_CT_0___Lipo-100 =====
Network: WORC_LIPO_classification_minus
Run: WORC_LIPO_classification_minus_2022-04-01T14-49-29
Node: featureconverter_train_predict_CalcFeatures_1_0_CT_0
Sample index: (0)
Sample id: Lipo-100
Status: JobState.cancelled
Timestamp: 2022-04-01 12:57:00.622826
Job file: /scratch/dspaanderman/tmp/WORC_LIPO_classification_minus/featureconverter_train_predict_CalcFeatures_1_0_CT_0/Lipo-100/__fastr_result__.yaml

----- ERRORS -----
- FastrError: job cancelled by fastr DRMAA plugin (/home/dspaanderman/WORCTutorial/env/lib/python3.6/site-packages/fastr/resources/plugins/executionplugins/drmaaexecution.py:300)
------------------

No process information:
Cannot find process information in Job information, processing probably got killed.
If there are no other errors, this is often a result of too high memory use or
exceeding some other type of resource limit.

Output data:
{}

Status history:
2022-04-01 12:57:00.622843: JobState.created
2022-04-01 13:08:31.203427: JobState.queued
2022-04-01 13:11:35.373959: JobState.cancelled
Located result pickle: /scratch/dspaanderman/tmp/WORC_LIPO_classification_minus/features_train_CT_0_predict/Lipo-100_0/__fastr_result__.yaml


===== JOB WORC_LIPO_classification_minus___features_train_CT_0_predict___Lipo-100___0 =====
Network: WORC_LIPO_classification_minus
Run: WORC_LIPO_classification_minus_2022-04-01T14-49-29
Node: features_train_CT_0_predict
Sample index: (0)
Sample id: Lipo-100
Status: JobState.cancelled
Timestamp: 2022-04-01 12:57:05.028554
Job file: /scratch/dspaanderman/tmp/WORC_LIPO_classification_minus/features_train_CT_0_predict/Lipo-100_0/__fastr_result__.yaml

----- ERRORS -----
- FastrError: job cancelled by fastr DRMAA plugin (/home/dspaanderman/WORCTutorial/env/lib/python3.6/site-packages/fastr/resources/plugins/executionplugins/drmaaexecution.py:300)
------------------

No process information:
Cannot find process information in Job information, processing probably got killed.
If there are no other errors, this is often a result of too high memory use or
exceeding some other type of resource limit.

Output data:
{}

Status history:
2022-04-01 12:57:05.028571: JobState.created
2022-04-01 13:08:36.418976: JobState.queued
2022-04-01 13:11:35.436503: JobState.cancelled
Located result pickle: /scratch/dspaanderman/tmp/WORC_LIPO_classification_minus/calcfeatures_train_predict_CalcFeatures_1_0_CT_0/Lipo-100/__fastr_result__.yaml


===== JOB WORC_LIPO_classification_minus___calcfeatures_train_predict_CalcFeatures_1_0_CT_0___Lipo-100 =====
Network: WORC_LIPO_classification_minus
Run: WORC_LIPO_classification_minus_2022-04-01T14-49-29
Node: calcfeatures_train_predict_CalcFeatures_1_0_CT_0
Sample index: (0)
Sample id: Lipo-100
Status: JobState.execution_failed
Timestamp: 2022-04-01 12:51:50.614785
Job file: /scratch/dspaanderman/tmp/WORC_LIPO_classification_minus/calcfeatures_train_predict_CalcFeatures_1_0_CT_0/Lipo-100/__fastr_result__.yaml

----- ERRORS -----
- FastrOutputValidationError: Output value [HDF5] "vfs://tmp/WORC_LIPO_classification_minus/calcfeatures_train_predict_CalcFeatures_1_0_CT_0/Lipo-100/features_0.hdf5" not valid for datatype "'HDF5'" (/home/dspaanderman/WORCTutorial/env/lib/python3.6/site-packages/fastr/execution/job.py:1151)
- FastrOutputValidationError: The output "features" is invalid! (/home/dspaanderman/WORCTutorial/env/lib/python3.6/site-packages/fastr/execution/job.py:1099)
- FastrErrorInSubprocess: Traceback (most recent call last):
  File "/home/dspaanderman/WORCTutorial/env/lib/python3.6/site-packages/WORC/resources/fastr_tools/predict/bin/CalcFeatures_tool.py", line 72, in <module>
    main()
  File "/home/dspaanderman/WORCTutorial/env/lib/python3.6/site-packages/WORC/resources/fastr_tools/predict/bin/CalcFeatures_tool.py", line 68, in main
    semantics_file=args.sem)
  File "/home/dspaanderman/WORCTutorial/env/lib/python3.6/site-packages/PREDICT/CalcFeatures.py", line 74, in CalcFeatures
    config = config_io.load_config(parameters)
  File "/home/dspaanderman/WORCTutorial/env/lib/python3.6/site-packages/PREDICT/IOparser/config_io_CalcFeatures.py", line 94, in load_config
    settings['ImageFeatures']['extraction_mode']
  File "/cm/shared/apps/python3/3.6.8/lib/python3.6/configparser.py", line 1233, in __getitem__
    raise KeyError(key)
KeyError: 'extraction_mode'
 (/home/dspaanderman/WORCTutorial/env/lib/python3.6/site-packages/fastr/execution/executionscript.py:111)
- FastrValueError: Output values are not valid! (/home/dspaanderman/WORCTutorial/env/lib/python3.6/site-packages/fastr/execution/job.py:830)
------------------

Command:
List representation: ['python', '/home/dspaanderman/WORCTutorial/env/lib/python3.6/site-packages/WORC/resources/fastr_tools/predict/bin/CalcFeatures_tool.py', '--im', '/scratch/dspaanderman/tmp/WORC_LIPO_classification_minus/preprocessing_train_CT_0/Lipo-100/image_0.nii.gz', '--out', '/scratch/dspaanderman/tmp/WORC_LIPO_classification_minus/calcfeatures_train_predict_CalcFeatures_1_0_CT_0/Lipo-100/features_0.hdf5', '--seg', '/scratch/dspaanderman/tmp/WORC_LIPO_classification_minus/segmentix_train_CT_0/Lipo-100/segmentation_out_0.nii.gz', '--para', '/scratch/dspaanderman/tmp/WORC_LIPO_classification_minus/config_CT_0/id_0/result/config_WORC_LIPO_classification_minus_0.ini']

String representation: python /home/dspaanderman/WORCTutorial/env/lib/python3.6/site-packages/WORC/resources/fastr_tools/predict/bin/CalcFeatures_tool.py --im /scratch/dspaanderman/tmp/WORC_LIPO_classification_minus/preprocessing_train_CT_0/Lipo-100/image_0.nii.gz --out /scratch/dspaanderman/tmp/WORC_LIPO_classification_minus/calcfeatures_train_predict_CalcFeatures_1_0_CT_0/Lipo-100/features_0.hdf5 --seg /scratch/dspaanderman/tmp/WORC_LIPO_classification_minus/segmentix_train_CT_0/Lipo-100/segmentation_out_0.nii.gz --para /scratch/dspaanderman/tmp/WORC_LIPO_classification_minus/config_CT_0/id_0/result/config_WORC_LIPO_classification_minus_0.ini


Output data:
{'features': [<HDF5: 'vfs://tmp/WORC_LIPO_classification_minus/calcfeatures_train_predict_CalcFeatures_1_0_CT_0/Lipo-100/features_0.hdf5'>]}

Status history:
2022-04-01 12:51:50.614800: JobState.created
2022-04-01 13:11:07.530470: JobState.running
2022-04-01 13:11:28.013769: JobState.execution_failed

----- STDOUT -----

------------------

----- STDERR -----
Traceback (most recent call last):
  File "/home/dspaanderman/WORCTutorial/env/lib/python3.6/site-packages/WORC/resources/fastr_tools/predict/bin/CalcFeatures_tool.py", line 72, in <module>
    main()
  File "/home/dspaanderman/WORCTutorial/env/lib/python3.6/site-packages/WORC/resources/fastr_tools/predict/bin/CalcFeatures_tool.py", line 68, in main
    semantics_file=args.sem)
  File "/home/dspaanderman/WORCTutorial/env/lib/python3.6/site-packages/PREDICT/CalcFeatures.py", line 74, in CalcFeatures
    config = config_io.load_config(parameters)
  File "/home/dspaanderman/WORCTutorial/env/lib/python3.6/site-packages/PREDICT/IOparser/config_io_CalcFeatures.py", line 94, in load_config
    settings['ImageFeatures']['extraction_mode']
  File "/cm/shared/apps/python3/3.6.8/lib/python3.6/configparser.py", line 1233, in __getitem__
    raise KeyError(key)
KeyError: 'extraction_mode'

------------------

To Reproduce
Fresh Install with the newest version of WORC and PREDICTFastr.

Expected behavior
WORC config file should include extraction_mode in order to determine whether to use 2D, 2.5D or 3D feature extraction.

Desktop:

  • OS: Scientific Linux release 6.10 (Carbon)
  • Python version: 3.6.8
  • WORC Version: 3.5.0

fastr - found HDF5, which is not found in the typelist

When I use fastr for the trainclassifier function, I get the following error:

File "", line 1, in
trainclassifier.trainclassifier(test, patientinfo, config, output_hdf, output_json)

File "C:\Users\Gebruiker\Anaconda3\envs\py27\lib\site-packages\PREDICT\trainclassifier.py", line 226, in trainclassifier
tempsave=config['General']['tempsave'])

File "C:\Users\Gebruiker\Anaconda3\envs\py27\lib\site-packages\PREDICT\classification\crossval.py", line 295, in crossval
**config['HyperOptimization'])

File "C:\Users\Gebruiker\Anaconda3\envs\py27\lib\site-packages\PREDICT\classification\parameter_optimization.py", line 82, in random_search_parameters
random_search.fit(features, labels)

File "C:\Users\Gebruiker\Anaconda3\envs\py27\lib\site-packages\PREDICT\processing\SearchCV.py", line 1628, in fit
return self._fit(X, y, groups, sampled_params)

File "C:\Users\Gebruiker\Anaconda3\envs\py27\lib\site-packages\PREDICT\processing\SearchCV.py", line 1319, in fit
estimator_data = network.create_source('HDF5', id_='estimator_source')

File "C:\Users\Gebruiker\Anaconda3\envs\py27\lib\site-packages\fastr\core\network.py", line 543, in create_source
source_node = SourceNode(datatype=datatype, id_=id_, parent=self, nodegroup=nodegroup)

File "C:\Users\Gebruiker\Anaconda3\envs\py27\lib\site-packages\fastr\core\node.py", line 1243, in __init__
raise exceptions.FastrValueError(message)

FastrValueError: [fastr:///networks/PREDICT_GridSearch_E7F5941TN9/0.0/nodelist/estimator_source] Unknown DataType for SourceNode fastr:///networks/PREDICT_GridSearch_E7F5941TN9/0.0/nodelist/estimator_source (found HDF5, which is not found in the typelist)!

There was no config file in my $HOME/.fastr/ folder, but also after putting it there ($HOME/.fastr/config.d/PREDICT_config.py), I got the same error.
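For reference, fastr learns about WORC's datatypes (including HDF5) through the types_path entry of its configuration. A hedged config fragment for $HOME/.fastr/config.py, assuming packagedir points at your site-packages; tools_path and types_path are predefined in fastr's configuration namespace, so this is a config fragment rather than a standalone script:

```python
import os

# Hypothetical location; adjust to your environment
packagedir = r"C:\Users\Gebruiker\Anaconda3\envs\py27\lib\site-packages"

# Register WORC's fastr tools and types (the HDF5 type lives in fastr_types)
tools_path = [os.path.join(packagedir, 'WORC', 'resources', 'fastr_tools')] + tools_path
types_path = [os.path.join(packagedir, 'WORC', 'resources', 'fastr_types')] + types_path
```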

classify job has failed

Traceback (most recent call last):
File "/home/jose/anaconda3/envs/worc_pro/lib/python2.7/site-packages/WORC/resources/fastr_tools/predict/bin/TrainClassifier.py", line 64, in <module>
main()
File "/home/jose/anaconda3/envs/worc_pro/lib/python2.7/site-packages/WORC/resources/fastr_tools/predict/bin/TrainClassifier.py", line 60, in main
fixedsplits=args.fs)
File "/home/jose/anaconda3/envs/worc_pro/lib/python2.7/site-packages/PREDICT/trainclassifier.py", line 120, in trainclassifier
load_features(feat_train, patientinfo_train, label_type)
File "/home/jose/anaconda3/envs/worc_pro/lib/python2.7/site-packages/PREDICT/trainclassifier.py", line 276, in load_features
label_type, modnames)
File "/home/jose/anaconda3/envs/worc_pro/lib/python2.7/site-packages/PREDICT/IOparser/file_io.py", line 77, in load_data
image_features)
File "/home/jose/anaconda3/envs/worc_pro/lib/python2.7/site-packages/PREDICT/genetics/genetic_processing.py", line 179, in findmutationdata
mutation_data_temp = load_mutation_status(patientinfo, mutation_type)
File "/home/jose/anaconda3/envs/worc_pro/lib/python2.7/site-packages/PREDICT/genetics/genetic_processing.py", line 41, in load_mutation_status
genetic_file)
File "/home/jose/anaconda3/envs/worc_pro/lib/python2.7/site-packages/PREDICT/genetics/genetic_processing.py", line 79, in load_genetic_file
data = np.loadtxt(input_file, np.str)
File "/home/jose/anaconda3/envs/worc_pro/lib/python2.7/site-packages/numpy/lib/npyio.py", line 1101, in loadtxt
for x in read_data(_loadtxt_chunksize):
File "/home/jose/anaconda3/envs/worc_pro/lib/python2.7/site-packages/numpy/lib/npyio.py", line 1025, in read_data
% line_num)
ValueError: Wrong number of columns at line 2
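The np.loadtxt error means one row of the patient info file has a different number of whitespace-separated fields than the first row (often a stray space in a patient name or a missing label). A small self-contained check, assuming the whitespace-separated format used by PREDICT:

```python
# Minimal sketch: find the inconsistent row that makes np.loadtxt raise
# "Wrong number of columns". Assumes a whitespace-separated patient info file.
def check_columns(lines):
    """Return the first 1-based line number whose column count differs, else None."""
    counts = [len(line.split()) for line in lines if line.strip()]
    for i, c in enumerate(counts[1:], start=2):
        if c != counts[0]:
            return i
    return None

good = ["Patient Label1", "pat001 1", "pat002 0"]
bad = ["Patient Label1", "pat001 1 extra", "pat002 0"]
print(check_columns(good))  # None
print(check_columns(bad))   # 2
```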

Test with already trained workflow

I have run a couple of experiments and was wondering whether it is possible to obtain the trained workflow and apply it to a new dataset to see how it performs.

Evaluate two or more test sets individually

Is it possible to evaluate two or more test sets individually?
Train on A, test on B, C, D, ... and output corresponding performances (perf-B, perf-C, perf-D, ...)?
I have tried to set images_test and segmentations_test as follows:

experiment.images_train.append(A_img_train)
experiment.segmentations_train.append(A_seg_train)

experiment.images_test.append(A_img_test)
experiment.segmentations_test.append(A_seg_test)
experiment.images_test.append(B_img)
experiment.segmentations_test.append(B_test)

...

experiment.add_evaluation()

experiment.set_multicore_execution()
experiment.execute()

But only one performance/evaluation is output.
In addition, I have checked estimator_all_0.hdf5; it seems that only A_img_test and A_seg_test are used in the testing phase.
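If only the first appended test set is used, a common workaround is to run one experiment per external test set, keeping the training inputs fixed. The sketch below only builds the per-set input plans as plain dictionaries; feeding each plan into a real WORC experiment (via images_train/images_test etc., as in the snippet above) is left hypothetical:

```python
# Hypothetical plan builder: one experiment per external test set (B, C, ...),
# each trained on the same A training data. The dict keys mirror the WORC
# experiment attributes used above; the values here are placeholder names.
test_sets = {
    "B": ("B_img", "B_seg"),
    "C": ("C_img", "C_seg"),
}

plans = []
for name, (img, seg) in sorted(test_sets.items()):
    plans.append({
        "images_train": ["A_img_train"],
        "segmentations_train": ["A_seg_train"],
        "images_test": [img],
        "segmentations_test": [seg],
        "label": f"perf-{name}",
    })

for plan in plans:
    print(plan["label"], plan["images_test"])
# perf-B ['B_img']
# perf-C ['C_img']
```

Each plan would then configure and execute its own experiment, yielding one performance output per test set.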

Labeling Issue - No entry found

I've got an issue in which the classify_all job fails. Manually running the call command raises the following error:
PREDICT.addexceptions.PREDICTIOError: No entry found in labeling for feature file c:\users\oliver\appdata\local\temp\LiTS_Classification\calcfeatures_train_CT_0\id_0\features_0.hdf5.
It looks like an issue with loading the correct labels within PREDICT\trainclassifier.py. However, I double-checked the pinfo.txt file containing the labels. It has the same format as the tutorial example (separated by spaces) and contains the headers 'Patient' and 'Label1', the latter of which is also set correctly in config.genetics.
The PatientIDs correspond to the folder names containing image.nii.gz and mask.nii.gz.
What have I missed?

The patient_info_file used: pinfo_small.txt
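The error message shows PREDICT matching each feature-file path against the patient IDs in the label file, so the usual culprits are an ID mismatch or an unexpected separator. A self-contained sketch of that matching logic; the load_pinfo helper and the substring match are illustrative, not PREDICT's exact code:

```python
# Sketch: validate a pinfo.txt against a feature-file path. Assumptions:
# whitespace-separated file, first column 'Patient', and patient IDs that
# occur somewhere in the feature-file path (as in the error message above).
def load_pinfo(text):
    lines = [l.split() for l in text.splitlines() if l.strip()]
    header, rows = lines[0], lines[1:]
    if header[0] != "Patient":
        raise ValueError("first column must be 'Patient'")
    return {row[0]: dict(zip(header[1:], row[1:])) for row in rows}

pinfo = load_pinfo("Patient Label1\nid_0 1\nid_1 0\n")
feature_file = r"c:\temp\LiTS_Classification\calcfeatures_train_CT_0\id_0\features_0.hdf5"

# Roughly what the lookup amounts to: which patient ID occurs in the path?
matches = [p for p in pinfo if p in feature_file]
print(matches)  # ['id_0']
```

If matches comes back empty for one of your feature files, the "No entry found in labeling" error is expected.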

Too many files can cause the fingerprinter to fail on Windows

I followed the tutorial and fed in my own data. The network execution finished, but most of the jobs/classification tasks failed.

...
 [INFO]   noderun:0592 >> Creating job for node fastr:///networks/WORC_BCMS_SY/0.0/runs/WORC_BCMS_SY_2023-05-30T21-36-06/nodelist/classification sample id <SampleId ('all',)>, index <SampleIndex (0)>
 [INFO] networkrun:0654 >> Queueing job WORC_BCMS_SY___classification___all___0
 [INFO]   noderun:0592 >> Creating job for node fastr:///networks/WORC_BCMS_SY/0.0/runs/WORC_BCMS_SY_2023-05-30T21-36-06/nodelist/performance sample id <SampleId ('all',)>, index <SampleIndex (0)>
 [INFO] networkrun:0654 >> Queueing job WORC_BCMS_SY___performance___all___0
 [INFO] networkrun:0657 >> Waiting for execution to finish...
 [INFO] networkrun:0806 >> Finished job WORC_BCMS_SY___convert_im_train_MRI_0___P137996 with status JobState.finished
 [INFO] networkrun:0806 >> Finished job WORC_BCMS_SY___convert_im_train_MRI_0___P141824 with status JobState.finished
...
 [WARNING] executionplugin:0341 >> Job WORC_BCMS_SY___featureconverter_train_predict_CalcFeatures_1_0_MRI_0___P744962 is no longer under processing, cannot cancel!
 [WARNING] executionplugin:0341 >> Job WORC_BCMS_SY___featureconverter_train_predict_CalcFeatures_1_0_MRI_0___P745979 is no longer under processing, cannot cancel!
 [WARNING] executionplugin:0341 >> Job WORC_BCMS_SY___featureconverter_train_predict_CalcFeatures_1_0_MRI_0___P80292 is no longer under processing, cannot cancel!
 [WARNING] executionplugin:0341 >> Job WORC_BCMS_SY___featureconverter_train_predict_CalcFeatures_1_0_MRI_0___P84221 is no longer under processing, cannot cancel!
 [WARNING] executionplugin:0341 >> Job WORC_BCMS_SY___featureconverter_train_predict_CalcFeatures_1_0_MRI_0___P85705 is no longer under processing, cannot cancel!
 [WARNING] executionplugin:0341 >> Job WORC_BCMS_SY___featureconverter_train_predict_CalcFeatures_1_0_MRI_0___P8632 is no longer under processing, cannot cancel!
 [WARNING] executionplugin:0341 >> Job WORC_BCMS_SY___featureconverter_train_predict_CalcFeatures_1_0_MRI_0___P92423 is no longer under processing, cannot cancel!
 [INFO] networkrun:0806 >> Finished job WORC_BCMS_SY___config_classification_sink___all___0 with status JobState.finished
 [INFO] networkrun:0686 >> Chunk execution finished!
 [INFO] executionplugin:0523 >> Callback processing thread for LinearExecution ended!
 [INFO] networkrun:0688 >> ####################################
 [INFO] networkrun:0689 >> #    network execution FINISHED    #
 [INFO] networkrun:0690 >> ####################################
 [INFO] simplereport:0026 >> ===== RESULTS =====
 [INFO] simplereport:0036 >> classification: 0 success / 0 missing / 1 failed
 [INFO] simplereport:0036 >> config_MRI_0_sink: 0 success / 0 missing / 1 failed
 [INFO] simplereport:0036 >> config_classification_sink: 1 success / 0 missing / 0 failed
 [INFO] simplereport:0036 >> features_train_MRI_0_predict: 0 success / 0 missing / 328 failed
 [INFO] simplereport:0036 >> performance: 0 success / 0 missing / 1 failed
 [INFO] simplereport:0036 >> segmentations_out_segmentix_train_MRI_0: 0 success / 0 missing / 328 failed
 [INFO] simplereport:0037 >> ===================
 [WARNING] simplereport:0049 >> There were failed samples in the run, to start debugging you can run:

    fastr trace E:/WORC/Tmp\__sink_data__.json --sinks

see the debug section in the manual at https://fastr.readthedocs.io/en/develop/static/user_manual.html#debugging for more information.

I ran fastr trace E:/WORC/Tmp\__sink_data__.json --sinks and got:

 [WARNING]  __init__:0084 >> Not running in a production installation (branch "unknown" from installed package)
classification -- 1 failed -- 0 succeeded
config_MRI_0_sink -- 1 failed -- 0 succeeded
config_classification_sink -- 0 failed -- 1 succeeded
features_train_MRI_0_predict -- 328 failed -- 0 succeeded
performance -- 1 failed -- 0 succeeded
segmentations_out_segmentix_train_MRI_0 -- 328 failed -- 0 succeeded

Running on Windows, Python 3.7, installed via pip. WORC_config.py:

import os
import fastr
import pkg_resources
import site
import sys

# Get directory in which packages are installed
working_set = pkg_resources.working_set
requirement_spec = pkg_resources.Requirement.parse('WORC')
egg_info = working_set.find(requirement_spec)
if egg_info is None:  # Backwards compatibility with WORC2
    try:
        packagedir = site.getsitepackages()[0]
    except AttributeError:
        # Inside virtualenvironment, so getsitepackages doesnt work.
        paths = sys.path
        for p in paths:
            if os.path.isdir(p) and os.path.basename(p) == 'site-packages':
                packagedir = p
else:
    packagedir = egg_info.location

# Add the WORC FASTR tools and type paths
tools_path = [os.path.join(packagedir, 'WORC', 'resources', 'fastr_tools')] + tools_path
types_path = [os.path.join(packagedir, 'WORC', 'resources', 'fastr_types')] + types_path

# Mounts accessible to fastr virtual file system
mounts['worc_example_data'] = os.path.join(packagedir, 'WORC', 'exampledata')
mounts['apps'] = os.path.expanduser(os.path.join('~', 'apps'))
# mounts['output'] = os.path.expanduser(os.path.join('~', 'WORC', 'output'))
mounts['output'] = "E:\\WORC\\output"
mounts['home'] = "E:\\WORC"
mounts['test'] = os.path.join(packagedir, 'WORC', 'resources', 'fastr_tests')

# The ITKFile type requires a preferred type when no specification is given.
# We will set it to Nifti, but you may change this.
preferred_types += ["NiftiImageFileCompressed"]

How can I debug or find out whether anything is misconfigured?
By the way, does WORC have any pause-and-resume mechanism?

[BUG]

Describe the bug
After pip install WORC, libraries such as scikit-learn and imbalanced-learn are downgraded to versions that conflict with other libraries. I get the following error when importing WORC:

[WARNING] __init__:0082 >> Not running in a production installation (branch "unknown" from installed package)
AttributeError: module 'numpy' has no attribute 'float'.
np.float was a deprecated alias for the builtin float. To avoid this error in existing code, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

Downgrading numpy to version 1.19 gives additional problems with matplotlib.
It seems the dependencies are not up to date.
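np.float was removed in NumPy 1.24 after being deprecated in 1.20, so until the WORC code is updated, pinning NumPy below 1.24 (rather than all the way down to 1.19) is a plausible middle ground. The bound below is an assumption to verify against your other dependencies:

```
# np.float was removed in NumPy 1.24; 1.20-1.23 keep it as a deprecated alias
numpy>=1.20,<1.24
```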

WORC configuration
No changes applied.

To Reproduce
pip install worc==3.6.0
python
import WORC

Desktop:

  • OS: [Ubuntu Linux]
  • Python version: [3.9.15]
  • WORC Version [3.6.0]

Use features_from_this_directory

Hi!

I am trying to use WORC with features I already extracted using PyRadiomics. I made HDF5 files for every patient containing the features, using the example that you provided. However, when I run WORC it says that the first column should be 'Patient'. Do I add the patient number as a feature value and feature label?


config.ini

I tried to use a specific configuration in WORC, defined in a config.ini file. In my script I define it like this:

config = ['/home/jtovar/worc_prostate/config_Melanoma_0830_0_original.ini']

or like this:

config = '/home/jtovar/worc_prostate/config_Melanoma_0830_0_original.ini'

network.configs.append(config)

When I run the script, it gives me the same error either way:

Traceback (most recent call last):
File "worc_example.py", line 62, in <module>
network.build()
File "/home/jtovar/worc_prostate/worc_pro/lib/python2.7/site-packages/WORC/WORC.py", line 318, in build
self.build_training()
File "/home/jtovar/worc_prostate/worc_pro/lib/python2.7/site-packages/WORC/WORC.py", line 344, in build_training
image_types.append(self.configs[c]['ImageFeatures']['image_type'])
TypeError: list indices must be integers, not str
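The final line of the traceback says a list was indexed with a string, which suggests self.configs held plain path strings where that WORC version expected already-parsed config objects; whatever the WORC-side cause, the failure itself is generic Python. A minimal reproduction plus a hedged workaround (loading the .ini with configparser before appending; whether your WORC version accepts raw paths or parsed objects is an assumption to check):

```python
import configparser

# Reproduce the failure in isolation: a bare string in the list ...
configs = ["/path/config.ini"]
try:
    configs[configs[0]]  # ... used as an index -> TypeError, as in the traceback
except TypeError as err:
    print("reproduced:", err)

# Hedged workaround: parse the .ini so indexing by section name works
parsed = configparser.ConfigParser()
parsed.read_string("[ImageFeatures]\nimage_type = CT\n")
print(parsed["ImageFeatures"]["image_type"])  # CT
```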

ZeroDivisionError: Weights sum to zero, can't be normalized

When running the trainclassifier function, I ran into the following error:

File "C:\Users\Gebruiker\Anaconda3\envs\py27\lib\site-packages\PREDICT\processing\SearchCV.py", line 618, in _store
array_means[:, np.newaxis]) ** 2,

TypeError: list indices must be integers, not tuple

I tried removing [:, np.newaxis], which resulted in the following error:

File "C:\Users\Gebruiker\Anaconda3\envs\py27\lib\site-packages\PREDICT\processing\SearchCV.py", line 619, in _store
axis=1, weights=weights))

File "C:\Users\Gebruiker\Anaconda3\envs\py27\lib\site-packages\numpy\lib\function_base.py", line 386, in average
"Weights sum to zero, can't be normalized")

ZeroDivisionError: Weights sum to zero, can't be normalized

So I'm getting zeros in my test_scores. Is the problem in my data, or is it that [:, np.newaxis] can't simply be removed?

(By the way, line 611 in SearchCV.py has one '{' too many.)
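Removing the reshape only masks the real problem: np.average raises because the weights sum to zero, i.e. every candidate got weight zero, which points at all-zero scores in the data rather than at the indexing. A pure-Python sketch of the failing computation and its guard:

```python
# Sketch of what np.average does internally when given weights: the weighted
# sum is divided by the weight total, so an all-zero weight vector cannot be
# normalized. Guarding (or fixing the upstream scores) beats deleting the
# [:, np.newaxis] reshape.
def weighted_mean(values, weights):
    total = sum(weights)
    if total == 0:
        raise ZeroDivisionError("Weights sum to zero, can't be normalized")
    return sum(v * w for v, w in zip(values, weights)) / total

print(weighted_mean([1.0, 2.0, 3.0], [1, 1, 2]))  # 2.25
```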

Max value in specificity range is over 1 in the final report

Describe the bug
In the final performance reports (performance_all_0.json), the max value in the specificity range is over 1.

WORC configuration
no customization

fastr trace

To Reproduce

Expected behavior
Specificity values in the range [0, 1].

Desktop (please complete the following information):

  • OS: Mac OS
  • Python version: 3.6.8
  • WORC Version 3.6.3

Additional context
None.

trainclassifier error

When I tried to run the trainclassifier, I ran into this error:

File "<stdin>", line 1, in <module>
trainclassifier.trainclassifier(test, patientinfo, config, output_hdf, output_json)

File "C:\Users\Gebruiker\Anaconda3\envs\py27\lib\site-packages\PREDICT\trainclassifier.py", line 226, in trainclassifier
tempsave=config['General']['tempsave'])

File "C:\Users\Gebruiker\Anaconda3\envs\py27\lib\site-packages\PREDICT\classification\crossval.py", line 295, in crossval
**config['HyperOptimization'])

File "C:\Users\Gebruiker\Anaconda3\envs\py27\lib\site-packages\PREDICT\classification\parameter_optimization.py", line 80, in random_search_parameters
verbose=1, cv=cv)

TypeError: __init__() takes at least 2 arguments (7 given)

I tried changing the tempsave and hyperoptimization parameters in the config file, but that didn't change anything.

classification job cancelled: not a valid HDF5 datatype

I tried to run a network classification task and got this message in the __sink_data__.json file:
"classification": {
    "all": {
        "status": "JobState.cancelled",
        "errors": [
            [
                "WORC_prostate_test___classify___all",
                "JobState.failed",
                "Encountered error: [FastrOutputValidationError] Output value [HDF5] \"vfs://tmp/prostate_test/classify/all/classification_0.hdf5\" not valid for datatype \"HDF5\" (/home/jose/anaconda3/envs/worc_pro/lib/python2.7/site-packages/fastr/execution/job.py:1087)",
                "./classify/all/fastr_result.pickle.gz"

Problems to install worc

Trying to install WORC, I get the following message:

Traceback (most recent call last):
  File "<string>", line 17, in <module>
  File "/home/jtovar/worc_prostate/worc_pros/build/cryptography/setup.py", line 28, in <module>
    "cryptography requires setuptools 18.5 or newer, please upgrade to a "
RuntimeError: cryptography requires setuptools 18.5 or newer, please upgrade to a newer version of setuptools

Running Tutorial

After hearing the ECR talk, I wanted to try out WORC and tried to run the tutorial. The problem is, if I try to execute the experiment (without any changes) I get the following error (from file C:\Users\USERNAME\AppData\Local\Temp\GS\W4AMFZHJ1L\tmp\fitandscore\id_0__0__0_fastr_result_.yaml)

I used pip to install WORC, because I couldn't find the package using conda.

Traceback (most recent call last):
File "c:\users\sa37\phd\code\radiomics\worc\venv\lib\site-packages\WORC\resources\fastr_tools\worc\bin\fitandscore_tool.py", line 84, in <module>
main()
File "c:\users\sa37\phd\code\radiomics\worc\venv\lib\site-packages\WORC\resources\fastr_tools\worc\bin\fitandscore_tool.py", line 58, in main
ret = Parallel(
File "c:\Users\sa37\PhD\Code\Radiomics\WORC\venv\lib\site-packages\joblib\parallel.py", line 1032, in __call__
while self.dispatch_one_batch(iterator):
File "c:\Users\sa37\PhD\Code\Radiomics\WORC\venv\lib\site-packages\joblib\parallel.py", line 847, in dispatch_one_batch
self._dispatch(tasks)
File "c:\Users\sa37\PhD\Code\Radiomics\WORC\venv\lib\site-packages\joblib\parallel.py", line 765, in _dispatch
job = self._backend.apply_async(batch, callback=cb)
File "c:\Users\sa37\PhD\Code\Radiomics\WORC\venv\lib\site-packages\joblib\_parallel_backends.py", line 208, in apply_async
result = ImmediateResult(func)
File "c:\Users\sa37\PhD\Code\Radiomics\WORC\venv\lib\site-packages\joblib\_parallel_backends.py", line 572, in __init__
self.results = batch()
File "c:\Users\sa37\PhD\Code\Radiomics\WORC\venv\lib\site-packages\joblib\parallel.py", line 252, in __call__
return [func(*args, **kwargs)
File "c:\Users\sa37\PhD\Code\Radiomics\WORC\venv\lib\site-packages\joblib\parallel.py", line 252, in <listcomp>
return [func(*args, **kwargs)
File "c:\Users\sa37\PhD\Code\Radiomics\WORC\venv\lib\site-packages\WORC\classification\fitandscore.py", line 243, in fit_and_score
imputer.fit(feature_values)
File "c:\Users\sa37\PhD\Code\Radiomics\WORC\venv\lib\site-packages\WORC\featureprocessing\Imputer.py", line 81, in fit
self.Imputer.fit(X, y)
File "c:\Users\sa37\PhD\Code\Radiomics\WORC\venv\lib\site-packages\missingpy\knnimpute.py", line 231, in fit
raise ValueError("There are only %d samples, but n_neighbors=%d."
ValueError: There are only 8 samples, but n_neighbors=9.

And the experiment fails. Do you have any idea what could cause this problem? I tried it in the debugger; there are a few samples beforehand where it does work.
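The failure is structural: missingpy's KNN imputer requires n_neighbors to be smaller than the number of samples it is fitted on, and a randomly sampled hyperparameter can exceed the size of a small cross-validation fold. A hypothetical guard, clamping the parameter before fitting:

```python
# Hypothetical guard: clamp the KNN-imputation neighbor count to the fold
# size, so a sampled n_neighbors=9 cannot be applied to an 8-sample fold.
def clamp_n_neighbors(n_neighbors, n_samples):
    return max(1, min(n_neighbors, n_samples - 1))

print(clamp_n_neighbors(9, 8))    # 7
print(clamp_n_neighbors(3, 100))  # 3
```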
