monashbi / arcana-legacy
Abstraction of Repository-Centric ANAlysis - a Python framework
License: Apache License 2.0
Make it possible to request a list of desired outputs instead of just one.
FieldSpec and OptionSpec should use the traits package instead of the Python int|float|str classes directly. This would allow the specification of more complex types if required (e.g. a list of floats), which could be permitted on a case-by-case basis (e.g. if the archive supports it).
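A minimal sketch of the idea, assuming the Enthought traits package (already a Nipype dependency); the class and method names below are illustrative, not the existing arcana API:

```python
# Hypothetical FieldSpec that holds a trait instance rather than a bare
# int/float/str class, so compound types such as a list of floats can be
# declared and validated.
from traits.api import Float, HasTraits, List


class FieldSpec(object):

    def __init__(self, name, trait, pipeline_name=None):
        self.name = name
        self.trait = trait
        self.pipeline_name = pipeline_name

    def validate(self, value):
        # Build a throwaway HasTraits holder so the trait does the checking;
        # a mismatch raises TraitError
        holder = type('Holder', (HasTraits,), {self.name: self.trait})()
        setattr(holder, self.name, value)
        return getattr(holder, self.name)


echo_times = FieldSpec('echo_times', List(Float))
echo_times.validate([0.0049, 0.0073])   # OK
# echo_times.validate('oops')           # would raise TraitError
```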
Add an option 'single_visit' to the LocalRepository (to be renamed DirectoryRepository) to avoid having to restructure the project directories to include the visit sub-directories.
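A sketch of how a 'single_visit' flag could be interpreted when walking the directory tree; the function and argument names are assumptions about the proposed DirectoryRepository, not existing code:

```python
import os


def iter_sessions(root_dir, single_visit=False, default_visit_id='VISIT1'):
    for subject_id in sorted(os.listdir(root_dir)):
        subject_dir = os.path.join(root_dir, subject_id)
        if not os.path.isdir(subject_dir):
            continue
        if single_visit:
            # The subject directory itself holds the (only) session, so no
            # visit sub-directories need to be created
            yield subject_id, default_visit_id, subject_dir
        else:
            for visit_id in sorted(os.listdir(subject_dir)):
                yield subject_id, visit_id, os.path.join(subject_dir,
                                                         visit_id)
```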
Instead of passing kwargs in pipeline getters as kwargs to the create_pipeline method, they should be passed in as a required dictionary argument "mods", to ensure that they are always passed through.
Also, getters should be able to take a name_map and store it in the pipeline itself, mapping any connections to inputs and outputs using the name map. This would bring modification functionality that is currently only available in MultiStudy to all Study classes and reduce the need to create explicit factory methods (i.e. methods could be modified in sub-classes without altering the base class).
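A standalone sketch of the proposed convention (all names are assumptions, not the current arcana signature): a required 'mods' dict plus an optional 'name_map' that is stored on the pipeline and used to re-route input/output connections.

```python
class Pipeline(object):

    def __init__(self, name, inputs, outputs, mods, name_map=None):
        self.name = name
        self.mods = mods                      # always present, even if empty
        self.name_map = dict(name_map or {})
        self.inputs = [self._map(i) for i in inputs]
        self.outputs = [self._map(o) for o in outputs]

    def _map(self, conn_name):
        # Re-map a connection name, falling back to the original
        return self.name_map.get(conn_name, conn_name)


pipeline = Pipeline('brain_extraction',
                    inputs=['magnitude'], outputs=['brain_mask'],
                    mods={'f_threshold': 0.4},
                    name_map={'magnitude': 't1_magnitude'})
print(pipeline.inputs)   # ['t1_magnitude']
```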
When restarting a crashed job, checking the local archive always returns nothing to do, as if the final output were already there. This happens even if only a few pre-computed outputs exist, and to run the pipeline I have to delete both the outputs from the archive and the working directory.
Since 'project' isn't referenced anywhere else in the package, 'per_project' should probably be renamed to 'per_study'.
Unit tests are required to check the regex, order and DICOM-field matching performed by DatasetMatch.
Pipeline options/version information needs to be stored in the archive alongside the data, and then checked before new runs are performed.
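A hedged sketch of one way the option/version record could be stored next to the derived data and compared before a re-run; the file name and layout are assumptions rather than an existing arcana convention:

```python
import json
import os


def save_provenance(session_dir, pipeline_name, options, versions):
    record = {'pipeline': pipeline_name,
              'options': options,
              'software_versions': versions}
    path = os.path.join(session_dir, pipeline_name + '_prov.json')
    with open(path, 'w') as f:
        json.dump(record, f, indent=2, sort_keys=True)


def requires_rerun(session_dir, pipeline_name, options, versions):
    path = os.path.join(session_dir, pipeline_name + '_prov.json')
    if not os.path.exists(path):
        return True                      # never run before
    with open(path) as f:
        previous = json.load(f)
    # Re-run if either the pipeline options or the tool versions changed
    return (previous['options'] != options or
            previous['software_versions'] != versions)
```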
FileFormat should either have an optional argument that takes a function that can extract a data array from a given dataset (so it can be plotted), or, alternatively, formats for which this is possible should extend the FileFormat class.
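A sketch of the first option, an optional loader callable; the class and attribute names are assumptions:

```python
import numpy as np


class FileFormat(object):

    def __init__(self, name, extension, array_loader=None):
        self.name = name
        self.extension = extension
        self._array_loader = array_loader

    @property
    def plottable(self):
        return self._array_loader is not None

    def data_array(self, path):
        if self._array_loader is None:
            raise NotImplementedError(
                "No array loader registered for format '{}'".format(self.name))
        return self._array_loader(path)


# e.g. a NIfTI format could register nibabel's loader, a text matrix numpy's
text_matrix_format = FileFormat('text_matrix', '.txt', array_loader=np.loadtxt)
```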
It seems that when you use pipeline.create_map_node, the requirements that you specify when the map_node is created are not passed to all the sub-nodes. This causes a 'command not found' error for all of them.
Check whether it is possible to pass the requirements to all the sub-nodes.
Currently, data specs are determined to be acquired or derived depending on whether a pipeline name is provided or not. It would probably be cleaner to define two separate classes for each case, as they have different attributes and methods. The specs that are provided as inputs/outputs to Pipeline.__init__ should probably also be named as such, i.e. DatasetInput or DatasetOutput.
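An illustrative split into two classes (hypothetical names): acquired specs never carry a pipeline, derived specs always do.

```python
class BaseDatasetSpec(object):

    def __init__(self, name, format):  # 'format' shadows the builtin, as in
        self.name = name               # the existing specs
        self.format = format


class AcquiredDatasetSpec(BaseDatasetSpec):
    """Provided as a study input; has no generating pipeline."""
    derived = False


class DerivedDatasetSpec(BaseDatasetSpec):
    """Generated by a pipeline, whose name is now a required argument."""
    derived = True

    def __init__(self, name, format, pipeline_name):
        super(DerivedDatasetSpec, self).__init__(name, format)
        self.pipeline_name = pipeline_name
```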
Edit the .travis.yml to install a local XNAT instance (using the Docker-compose script) so unit tests can be run locally
With Nipype updating its syntax, it could be easier and simpler to adopt the new syntax in Arcana now, to avoid having to change things later once more pipelines have been written.
In addition to the environment-modules loading code (or perhaps in preference to it), run all pipeline nodes within Singularity containers if Singularity is present.
URIs to the singularity containers can be kept in Requirement objects.
For env modules, need some code to map version names when creating a Runner.
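A sketch of both points: a Requirement that can carry a Singularity/Docker image URI, plus the kind of version-name map a Runner might need for environment modules. All names and URIs here are assumptions about a possible design.

```python
class Requirement(object):

    def __init__(self, name, min_version, container_uri=None):
        self.name = name
        self.min_version = min_version
        # e.g. 'docker://...' or 'shub://...'; None means "use env modules"
        self.container_uri = container_uri


fsl_req = Requirement('fsl', '5.0.9',
                      container_uri='docker://example/fsl:5.0.9')

# For environment modules, map requirement names onto local module names
module_name_map = {'fsl': 'fsl-parallel', 'mrtrix': 'mrtrix3'}
```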
Datasets should be able to be distinguished (matched) on the basis of their dicom fields (e.g. gre field mapping phase or mag). The new DatasetMatch infrastructure should make this possible
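A minimal sketch of header-value matching, assuming pydicom is available and that the check is applied to one representative file per series:

```python
import pydicom


def dicom_values_match(dicom_path, required_values):
    """required_values maps DICOM keywords to expected values, e.g.
    {'ImageType': ['ORIGINAL', 'PRIMARY', 'M', 'ND']} to select the
    magnitude series of a GRE field map."""
    header = pydicom.dcmread(dicom_path, stop_before_pixels=True)
    return all(getattr(header, keyword, None) == expected
               for keyword, expected in required_values.items())
```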
Allow explicit datasets and fields to be passed to Study inputs (in ExplicitDatasets|ExplicitFields objects). Iterables of Dataset|Field objects should be able to be passed as an input when using the dictionary inputs form.
Will need to implement the match(subject_id=None, visit_id=None) method, drawing the appropriate subject and visit IDs from the provided datasets (although they will typically be of 'per_project' frequency).
This will enable templates (e.g. atlases) to be passed to studies as inputs. A class attribute default_inputs could also be used to specify default templates to use for particular pipelines.
All datasets|fields should know which session|visit|subject they belong to (if any)
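A sketch (hypothetical names) of how an explicit collection of Dataset objects could implement the match() method, keyed on the subject/visit IDs each dataset already knows about; 'per_project' items would pass None for both IDs:

```python
class ExplicitDatasets(object):

    def __init__(self, datasets):
        # each dataset is assumed to expose .subject_id and .visit_id
        self._lookup = {(d.subject_id, d.visit_id): d for d in datasets}

    def match(self, subject_id=None, visit_id=None):
        try:
            return self._lookup[(subject_id, visit_id)]
        except KeyError:
            raise KeyError('No dataset for subject={}, visit={}'.format(
                subject_id, visit_id))
```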
Edit unit tests to use basic dataset types (e.g. text_format) that can be created on the fly to avoid the need to download data from XNAT and use data_formats defined in nianalysis.
Using the future package, make arcana Python 2+3 compatible
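The usual per-module header when porting with the future package (a common convention rather than anything arcana-specific):

```python
from __future__ import absolute_import, division, print_function
from builtins import object, range, str, zip  # provided by 'future' on Py2
```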
Error message:

```
  File "run_motion_detection.py", line 72, in <module>
    run_md(args.input_dir, dynamic=args.dynamic_md, xnat_id=args.xnat_id)
  File "run_motion_detection.py", line 48, in run_md
    visit_ids=[session_id], work_dir=WORK_PATH)
  File "/Users/francescosforazzini/git/NiAnalysis/nianalysis/pipeline.py", line 189, in run
    self.connect_to_archive(complete_workflow, **kwargs)
  File "/Users/francescosforazzini/git/NiAnalysis/nianalysis/pipeline.py", line 310, in connect_to_archive
    visit_ids=visit_ids)
  File "/Users/francescosforazzini/git/NiAnalysis/nianalysis/archive/xnat.py", line 771, in project
    processed=processed),
  File "/Users/francescosforazzini/git/NiAnalysis/nianalysis/archive/xnat.py", line 888, in _get_datasets
    multiplicity=mult, location=None))
  File "/Users/francescosforazzini/git/NiAnalysis/nianalysis/dataset.py", line 161, in __init__
    super(Dataset, self).__init__(name, format, multiplicity)
  File "/Users/francescosforazzini/git/NiAnalysis/nianalysis/dataset.py", line 86, in __init__
    super(BaseDataset, self).__init__(name=name, multiplicity=multiplicity)
  File "/Users/francescosforazzini/git/NiAnalysis/nianalysis/dataset.py", line 24, in __init__
    assert isinstance(name, basestring)
AssertionError
```
If I use the local archive instead of the xnat archive, everything works fine.
To reproduce this error you can run the script called assertion_error.py in my mbi-analysis branch (resting_state); it is in mbi-analysis/debug.
Maybe have a flag that enables you to reuse old work directories when generating data
Create a custom SlurmPlugin (that can also work with SGE) that only submits long jobs to the queue and processes short book-keeping nodes locally.
Implement submission over SSH with paramiko, so the main pipeline container can run on the XNAT server.
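A hedged sketch of the SSH-submission idea using paramiko; the host, user and command below are placeholders, and error handling is omitted:

```python
import paramiko


def submit_remote(host, username, key_filename, command):
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=username, key_filename=key_filename)
    try:
        _, stdout, stderr = client.exec_command(command)
        return stdout.read().decode(), stderr.read().decode()
    finally:
        client.close()


# e.g. submit the batch script for a pipeline from the XNAT server side
# out, err = submit_remote('cluster.example.edu', 'arcana', '~/.ssh/id_rsa',
#                          'sbatch ~/jobs/run_pipeline.sh')
```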
Optionally send progress information to a PIMS server
Instead of running pipelines explicitly, pipelines should be run when requesting a dataset/field. For this to happen, the Study object needs to have its own "Runner" object to determine how and where the processing pipelines are run (e.g. locally single/multi process or submitted to SLURM scheduler)
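A conceptual sketch only (all names are assumptions): requesting a derived spec from the Study triggers its pipeline through a Runner, which decides how and where the pipeline actually executes.

```python
class LinearRunner(object):
    """Runs pipelines one after another in the current process; a SLURM or
    multi-process runner would override run()."""

    def run(self, pipeline):
        pipeline.execute()


class Study(object):

    def __init__(self, runner):
        self._runner = runner
        self._derived = set()

    def data(self, spec_name):
        pipeline = self.pipeline_for(spec_name)     # None for acquired specs
        if pipeline is not None and spec_name not in self._derived:
            self._runner.run(pipeline)              # derive on demand
            self._derived.add(spec_name)
        return self.fetch_from_repository(spec_name)

    # pipeline_for() and fetch_from_repository() are left abstract here
```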
Repositories that are to be combined into a single study may not have the same ID scheme, and for XNAT repositories the session ID can depend on the subject and project ID (at least in the case of MBI-XNAT). So there needs to be a custom way to map between the IDs provided to the study and those of the repository.
A pair of lambda functions or an IDMapper object might be a good solution.
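A sketch of both options together (purely illustrative names): a small IDMapper object wrapping a pair of callables that translate between study-side IDs and repository-side IDs.

```python
class IDMapper(object):

    def __init__(self, to_repository, from_repository):
        self.to_repository = to_repository
        self.from_repository = from_repository


# e.g. MBI-XNAT style session labels of the form <PROJECT>_<SUBJECT>_<VISIT>
mbi_xnat_mapper = IDMapper(
    to_repository=lambda project, subject, visit: '{}_{}_{}'.format(
        project, subject, visit),
    from_repository=lambda session_label: tuple(session_label.split('_', 2)))

print(mbi_xnat_mapper.to_repository('MRH017', '001', 'MR01'))
# -> MRH017_001_MR01
```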
Similar to the MultiStudyMetaClass, should write a meta class that all Study classes should use to construct class members such as data_specs and default_options
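A minimal sketch of what such a metaclass could do: merge the 'add_data_specs' lists declared on the class and its bases into a single 'data_specs' dict. The attribute names mirror the MultiStudyMetaClass idea but are assumptions here.

```python
class StudyMetaClass(type):

    def __new__(mcs, name, bases, dct):
        combined = {}
        for base in reversed(bases):                 # bases first ...
            combined.update(getattr(base, 'data_specs', {}))
        for spec in dct.get('add_data_specs', []):   # ... then this class
            combined[spec.name] = spec
        dct['data_specs'] = combined
        return super(StudyMetaClass, mcs).__new__(mcs, name, bases, dct)
```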
For sub-studies that encapsulate different runs of the same type within a session (e.g. multiple fMRI tasks), need to come up with a way to conveniently set the BIDS "run" number and have all the default bids matches updated
Instead of creating new "MR Sessions" to store derived data, the QIB datatype should be used instead.
LocalRepository kwargs need to be specified for each Study field. Those could be passed to pybids queries. Alternatively, use a function pointer with a BIDSLayout argument passed in for more complex queries (think fieldmaps). More on pybids: https://github.com/INCF/pybids/tree/master/bids/grabbids
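A rough sketch of both alternatives against the grabbids-era pybids API; the exact keyword/entity names vary between pybids versions, so treat the filters below as assumptions:

```python
from bids.grabbids import BIDSLayout


def fieldmap_query(layout, subject_id, visit_id):
    """Function-pointer style: arbitrary logic run against the BIDSLayout."""
    return layout.get(subject=subject_id, session=visit_id,
                      modality='fmap', return_type='file')


layout = BIDSLayout('/path/to/bids/dataset')
# kwargs style: simple selections expressed directly as pybids filters
bold_files = layout.get(subject='01', type='bold', return_type='file')
# function-pointer style for the more complex cases (e.g. fieldmaps)
fmap_files = fieldmap_query(layout, '01', '01')
```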
When passing inputs to a study as a dictionary, *Match objects should be allowed not to have names, and just take the name from the dictionary key.
Similarly for formats: these should be allowed to be an optional input that is detected from the matches themselves (and checked against the provided format, if one is given).
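A sketch of the proposed dictionary-input form (the attributes and helper below are hypothetical, not the current DatasetMatch API): the match takes its name from the dict key, and its format is detected from the matched data unless one is given explicitly.

```python
class DatasetMatch(object):

    def __init__(self, pattern, format=None, name=None):
        self.pattern = pattern
        self.format = format          # None -> detect from the matched data
        self.name = name              # None -> filled in from the dict key

    def bound_to(self, name):
        self.name = self.name if self.name is not None else name
        return self


inputs = {'magnitude': DatasetMatch(r'.*gre.*mag.*'),
          'phase': DatasetMatch(r'.*gre.*phase.*')}
matches = [match.bound_to(key) for key, match in inputs.items()]
```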
Instead of storing in *_PROC sessions (on XNAT) derived inputs should be stored in separate sessions for each study, e.g. MRH017_001_MR01_MYANALYSIS.
This would allow us to remove the study-name prefix for the derived datasets/fields, although care will have to be taken to allow it to remain for sub-study prefixes.
For the local archive, the derived outputs should be stored in separate sub-directories.
This should be a bit neater but will also make it easier to store and retrieve provenance data.
BaseDataset/BaseField objects will need to have an additional 'study' field to specify which study they were derived from and the 'get_tree' methods will need to search for all studies that are listed in the input matches.
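A small sketch of the session-naming scheme described above; the exact label format is an assumption based on the MRH017_001_MR01_MYANALYSIS example:

```python
def derived_session_label(subject_id, visit_id, study_name):
    return '{}_{}_{}'.format(subject_id, visit_id, study_name.upper())


print(derived_session_label('MRH017_001', 'MR01', 'myanalysis'))
# -> MRH017_001_MR01_MYANALYSIS
```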
It seems there is a bug when trying to restart a crashed workflow using the local archive (I haven't tested with xnat). When checking the precomputed outputs, it always returns that there is nothing to do, even if the final output has not already been produced. To make the pipeline work you need to delete the cache dir and all the previously produced outputs.