
neuroconv's Introduction


NeuroConv logo

Automatically convert neurophysiology data to NWB

Explore our documentation »


About

NeuroConv is a Python package for converting neurophysiology data in a variety of proprietary formats to the Neurodata Without Borders (NWB) standard.

Features:

  • Reads data from 40 popular neurophysiology data formats and writes to NWB using best practices.
  • Extracts relevant metadata from each format.
  • Handles large data volume by reading datasets piece-wise.
  • Minimizes the size of the NWB files by automatically applying chunking and lossless compression.
  • Supports ensembles of multiple data streams, and supports common methods for temporal alignment of streams.

Installation

We always recommend installing and running Python packages in a clean environment. One way to do this is via conda environments:

conda create --name <environment name> python=<version of Python>
conda activate <environment name>

To install the latest stable release of neuroconv through PyPI, run:

pip install neuroconv

To install the current unreleased main branch (requires git to be installed in your environment, such as via conda install git), run:

pip install git+https://github.com/catalystneuro/neuroconv.git@main

NeuroConv also supports a variety of extra dependencies that can be specified inside square brackets, such as

pip install "neuroconv[openephys, dandi]"

which installs the extra dependencies for reading OpenEphys data as well as for using the DANDI CLI (such as automatic upload to the DANDI Archive).

You can read more about these options in the main installation guide.

Documentation

See our ReadTheDocs page for full documentation, including a gallery of all supported formats.

License

NeuroConv is distributed under the BSD3 License. See LICENSE for more information.


neuroconv's Issues

check ISO 8601 duration

The DANDI archive requires that Species.age be in ISO 8601 duration format. We should check that it is. This can be done in json-schema with a regex; there might be an easier way.
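
A minimal sketch of what this check could look like in json-schema is below; the regex is a commonly used ISO 8601 duration pattern and is an assumption, not necessarily the exact rule DANDI applies.

import jsonschema

# Hedged sketch: validate Species.age against an ISO 8601 duration regex via json-schema.
species_age_schema = {
    "type": "string",
    "description": "Age of the subject as an ISO 8601 duration, e.g. 'P90D' or 'P1Y2M'.",
    "pattern": r"^P(?!$)(\d+Y)?(\d+M)?(\d+W)?(\d+D)?(T(?=\d)(\d+H)?(\d+M)?(\d+S)?)?$",
}

jsonschema.validate(instance="P90D", schema=species_age_schema)  # passes
# jsonschema.validate(instance="90 days", schema=species_age_schema)  # raises ValidationError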

[Documentation]: ABF conversion in gallery

What would you like changed or added to the documentation and why?

ABF conversion in gallery

Do you have any interest in helping write or edit the documentation?

No.

Code of Conduct

Next steps to increase Coverage

Recently we accomplished a long-term goal of reaching 90 % coverage on the repo:
catalystneuro/nwb-conversion-tools#277

I am writing here the next steps for the next long-term project of increasing this even further, let's say to 95 %. The following is a list of what I believe are the remaining big-ticket items that can yield large increases in coverage:

Tools

  • neo (this is the module with the least coverage).
  • roiextractors (we have the get_epochs functions that we are not using).
  • spikeinterface (mainly the functions concerning adding waveforms).

Data interfaces

  • BaseSorting (has some functionality to add electrode information that we are not testing).
  • Axona (this module has a lot of functions and one data interface that we are not testing).
  • CellExplorer (a lot of functionality that I think properly belongs in the extractor is kept there untested).

Utils

  • Some of the functionalities in the dict module are not tested.

[Documentation]: add intersphinx

What would you like changed or added to the documentation and why?

Our docs, particularly our API docs, make references to objects in other libraries such as spikeinterface, roiextractors, spikeextractors, neo, etc. Intersphinx turns these references into functioning links just like internal documentation.
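
A minimal sketch of what enabling this in docs/conf.py could look like; the inventory URLs are assumptions about where each project hosts its documentation.

# Hedged sketch for docs/conf.py
extensions = [
    "sphinx.ext.intersphinx",
    # ... existing extensions ...
]

intersphinx_mapping = {
    "spikeinterface": ("https://spikeinterface.readthedocs.io/en/latest/", None),
    "roiextractors": ("https://roiextractors.readthedocs.io/en/latest/", None),
    "neo": ("https://neo.readthedocs.io/en/latest/", None),
    "pynwb": ("https://pynwb.readthedocs.io/en/stable/", None),
}

With that in place, standard Sphinx cross-reference roles pointing at those packages would resolve to external links.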

Do you have any interest in helping write or edit the documentation?

No.

Code of Conduct

Update remaining RecordingInterfaces to use SpikeInterface backend

  • IntanRecordingInterface
  • BlackrockRecordingExtractorInterface
  • BlackrockSortingExtractorInterface
  • CED [Not tested in the old backend]
  • Cell Explorer [Not available in spikeinterface]
  • Neuralynx
  • NeuroscopeSortingExtractor
  • OpenEphys
  • PhySortingInterface
  • SpikeGadgetsRecordingInterface
  • Axona
  • Axona LFP

@h-mayorquin Please add the others, thanks! (Done!)

reading TDT

Can we support reading TDT data? It looks like this is not yet supported by SpikeInterface, but is supported by python-neo.
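
A minimal sketch of reading a TDT block with python-neo (the directory path is a placeholder), which is what an eventual interface would likely build on:

from neo.io import TdtIO

# Hedged sketch: read a TDT block directly with python-neo's TdtIO reader.
reader = TdtIO(dirname="path/to/tdt_block_folder")  # placeholder path
block = reader.read_block()
print(block.segments[0].analogsignals)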

modular interface-based installation configurations

NeuroConv has many dependencies that are specific to particular data formats. We have been working on how to make imports modular. This works right now, but we might end up with a problem where different libraries have conflicting dependencies, so it would not be possible to install all of them. Therefore, I think it is also important to make installation modular. Currently, we have a few different installation options. You can run:

pip install neuroconv

for the minimal dependencies and

pip install neuroconv[full]

for the full dependencies. I propose a solution where you can specify modalities during install and pip will know to install the dependencies specifically for those modalities, e.g.:

pip install neuroconv[openephys]

would install the minimum plus pyopenephys. A user would also be able to specify multiple dependencies, e.g.:

pip install neuroconv[spikeglx,kilosort,phy,deeplabcut]

and the environment would be set up with the necessary libraries for those data sources.

In order to accomplish this, we will need to formally define the requirements of each type of source data. One way we can do this is to have requirements.txt files in each data type directory. For instance:
neuroconv/src/neuroconv/datainterfaces/behavior/deeplabcut/requirements.txt would be:

dlc2nwb>=0.2

The setup.py file would use glob to iterate through all of the data interface modules and build the extras_require dict with an entry for each source data type.
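
A hedged sketch of how setup.py could assemble this, assuming the per-format requirements.txt layout described above (the "full" convenience key and exact paths are assumptions):

from pathlib import Path
from setuptools import setup

root = Path(__file__).parent
datainterfaces_dir = root / "src" / "neuroconv" / "datainterfaces"

# Build one extras_require entry per format from its requirements.txt
extras_require = {}
for requirements_file in datainterfaces_dir.glob("*/*/requirements.txt"):
    format_name = requirements_file.parent.name  # e.g. "deeplabcut"
    extras_require[format_name] = requirements_file.read_text().strip().splitlines()

# Convenience option that pulls in every format's dependencies
extras_require["full"] = sorted({req for reqs in extras_require.values() for req in reqs})

setup(
    name="neuroconv",
    extras_require=extras_require,
    # ... remaining setup() arguments omitted ...
)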

[Documentation]: Remove old tutorials and their interfaces

What would you like changed or added to the documentation and why?

All NeuroConv interfaces should operate on files that are on the user's disk, and all tutorials should harness our example data from GIN to make this happen (though users are encouraged to try them on their own data as well).

Thus the old tutorials here: https://github.com/catalystneuro/neuroconv/tree/main/tutorials

should be removed as well as their interfaces: https://github.com/catalystneuro/neuroconv/tree/main/src/neuroconv/datainterfaces/ecephys/tutorial

which are built on synthetic in-memory SpikeInterface objects.

To replace these, some recent tutorials (including one on the .yml feature) were put together for the user days here: https://github.com/NeurodataWithoutBorders/nwb_hackathons/tree/main/HCK13_2022_Janelia/projects/neuroconv_tutorial

Do you have any interest in helping write or edit the documentation?

Yes.

Code of Conduct

Discussion: Unify `run_conversion` interfaces between base classes and `nwbconverter`

As we have been using run_conversion directly from the interfaces lately, I have been discussing this with @CodyCBakerPhD. I am opening this issue as a template for tomorrow's discussion.

I think that we should have a common set of arguments (i.e. unify the keyword arguments that control input-output). Here I mean things like passing an nwbfile or a save_path. Another is overwrite functionality.

Other important considerations listed in no particular order are:

  • The base data interfaces do not perform any validation.
  • Division of responsibilities between spikeinterface and nwb-conversion-tools.

For reference, the signatures of all the run_conversion methods at the time of writing this issue are:

# NWB general
def run_conversion(
    self,
    metadata: Optional[dict] = None,
    save_to_file: Optional[bool] = True,
    nwbfile_path: Optional[str] = None,
    overwrite: Optional[bool] = False,
    nwbfile: Optional[NWBFile] = None,
    conversion_options: Optional[dict] = None,
)

# Sorting extractor
def run_conversion(
    self,
    nwbfile: NWBFile = None,
    metadata: dict = None,
    stub_test: bool = False,
    save_path: OptionalFilePathType = None,
    overwrite: bool = False,
    write_ecephys_metadata: bool = False,
):

# Recording extractor
def run_conversion(
    self,
    nwbfile: NWBFile = None,
    metadata: dict = None,
    stub_test: bool = False,
    starting_time: Optional[float] = None,
    use_times: bool = False,
    save_path: OptionalFilePathType = None,
    overwrite: bool = False,
    write_as: str = "raw",
    write_electrical_series: bool = True,
    es_key: str = None,
    compression: Optional[str] = "gzip",
    compression_opts: Optional[int] = None,
    iterator_type: Optional[str] = None,
    iterator_opts: Optional[dict] = None,
)

# Movie data interface
def run_conversion(
    self,
    nwbfile: NWBFile,
    metadata: dict,
    stub_test: bool = False,
    external_mode: bool = True,
    starting_times: Optional[list] = None,
    chunk_data: bool = True,
    module_name: Optional[str] = None,
    module_description: Optional[str] = None,
    compression: Optional[str] = "gzip",
    compression_options: Optional[int] = None,
):

# Imaging
def run_conversion(
    self,
    nwbfile: NWBFile = None,
    metadata: dict = None,
    overwrite: bool = False,
    save_path: OptionalFilePathType = None,
):

# Segmentation
def run_conversion(self, nwbfile: NWBFile, metadata: dict, overwrite: bool = False):

[Feature]: save SpikeInterface `WaveformExtractor` object

What would you like to see added to NeuroConv?

Add a function in tools.spikeinterface to write_waveforms from a WaveformExtractor object.
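
A minimal sketch of what such a helper could look like; the function does not exist yet, so its name, signature, and the assumption of a mono-segment sorting are all placeholders:

import numpy as np

def write_waveforms(waveform_extractor, nwbfile):
    """Add per-unit mean waveforms and standard deviations from a WaveformExtractor to the NWB Units table."""
    sorting = waveform_extractor.sorting
    for unit_id in sorting.unit_ids:
        waveforms = waveform_extractor.get_waveforms(unit_id)  # shape: (num_spikes, num_samples, num_channels)
        nwbfile.add_unit(
            spike_times=sorting.get_unit_spike_train(unit_id, return_times=True),  # assumes mono-segment
            waveform_mean=np.mean(waveforms, axis=0),
            waveform_sd=np.std(waveforms, axis=0),
        )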

Is your feature request related to a problem?

Yes. Currently there is no way to write mean waveforms and standard deviations using SpikeInterface >= v0.90 (the maintained version).

Do you have any interest in helping implement the feature?

Yes.

Code of Conduct

Centralize definitions for metadata schemas

Adding this to the global to-do list, low priority. Would be another good first issue to tackle for someone learning the metadata details.

Describe the solution you'd like
Some of the base interfaces, such as the RecordingInterface, explicitly set definitions for their metadata schema structures.

These could and probably should be centralized into a single util location, and can also probably be generalized due to similarities across interfaces.

Checklist

  • Have you ensured the feature or change was not already reported?
  • Have you included a brief and descriptive title?
  • Have you included a clear description of the problem you are trying to solve?
  • Have you checked our Contributing document?

Is `add_electrical_series` module adding incorrect conversion information to the nwbfile?

So, the following recording extractor has no information about its units:

from pynwb.file import NWBFile
from neuroconv.tools.spikeinterface import add_electrical_series
from spikeinterface.core.testing_tools import generate_recording
from datetime import datetime

durations = [3]
sampling_frequency = 1.0
recording = generate_recording(num_channels=2, durations=durations, sampling_frequency=sampling_frequency)
recording.has_scaled_traces()

Its output is
False

However, if we use the add_electrical_series method with this recording, the resulting ElectricalSeries object in the nwbfile looks like this:

testing_session_time = datetime.now().astimezone()
nwbfile = NWBFile(
    session_description="session_description1", identifier="file_id1", session_start_time=testing_session_time
)

add_electrical_series(recording=recording, nwbfile=nwbfile)

acquisition_module = nwbfile.acquisition
electrical_series = acquisition_module["ElectricalSeries_raw"]
electrical_series

Output:

ElectricalSeries_raw pynwb.ecephys.ElectricalSeries at 0x139998292246144
Fields:
  comments: no comments
  conversion: 1e-06
  data: <hdmf.backends.hdf5.h5_utils.H5DataIO object at 0x7f53abe9afd0>
  description: Raw acquired data
  electrodes: electrodes <class 'hdmf.common.table.DynamicTableRegion'>
  offset: 0.0
  rate: 1.0
  resolution: -1.0
  starting_time: 0.0
  starting_time_unit: seconds
  unit: volts

So, we have written the value conversion=1e-06, which I think is incorrect. As I understand it, that conversion factor means that if the stored data were multiplied by 1e-06, the result would be the data in units of volts. But we cannot guarantee the latter.

What to do?

Maybe when the spikeinterface recording extractor returns has_scaled_traces() == False we should just throw a warning and then note in the comments that the time series contained there does not have concrete units.
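
A minimal sketch of that behavior, where the helper and keyword names are assumptions about the internal code rather than the current implementation:

import warnings

def resolve_conversion_kwargs(recording) -> dict:
    """Return ElectricalSeries conversion/comments kwargs depending on available scaling information."""
    if recording.has_scaled_traces():
        gains_to_volts = recording.get_channel_gains() * 1e-6  # SpikeInterface gains are in microvolts
        return dict(conversion=float(gains_to_volts[0]), comments="no comments")
    warnings.warn("Recording has no gain/offset information; traces are written with unknown units.")
    return dict(
        conversion=1.0,
        comments="Data are stored in native (unscaled) values; the unit is not guaranteed to be volts.",
    )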

@bendichter @CodyCBakerPhD What do you think? Is there something I am missing?

[Feature]: Add BEADL support (XML or MATLAB)

What would you like to see added to nwb-conversion-tools?

BEADL is a cool new standardized behavior format that supports Arduino and a few other hardware platforms at this point (and will likely grow; it can also support generic state-space definitions): https://beadl.org/

An extension is already in place for writing the data to NWB: https://github.com/rly/ndx-beadl

We would just need to harness it for interfacing with either the .xml format (https://beadl.org/beadl_xml.html) that defines experimental setups or the .mat files that contain the actual results of the experiment (an example file is here: https://github.com/rly/ndx-beadl/blob/main/docs/tutorial_nwb_userdays_2022/BeadlData.mat).

The only thing to consider is that, as I understand it, BEADL is still in active development, so it is hard to tell how fast things are changing with it.

Is your feature request related to a problem?

No response

What solution would you like?

We could specifically harness (in theory; I haven't tested it yet) the populate_from_matlab-type functions for each extension data type, indicated for example by https://github.com/rly/ndx-beadl/blob/main/src/pynwb/ndx_beadl/trials_table.py#L173

Do you have any interest in helping implement the feature?

Yes, but I would need guidance.

Code of Conduct

Add support for loopbio

While I was working with sleap I realized that they support reading videos from this recording device:
http://loopbio.com/recording/

Maybe this is something to support if it becomes popular in neuroscience (I looked through their page and they have a C. elegans example).

They have some python code already:
https://github.com/loopbio/imgstore

I have not looked deep into this. I am sharing here for keeping a record.

Add support for old format (`smr`) for `CEDRecordingInterface`

SpikeInterface has an extractor supporting the old format (smr), which is different from the extractor for the new smrx format that we are currently using.
https://github.com/SpikeInterface/spikeinterface/blob/master/spikeinterface/extractors/neoextractors/spike2.py

We could detect the format in the __init__ of CEDRecordingInterface and choose the extractor accordingly to offer support for smr files as well. I am not sure how popular this recording device is or how much the old format is still in use, so I don't know what priority this should have.
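
A minimal sketch of that detection; the extractor class names follow SpikeInterface's neo-based extractors and should be double-checked:

from pathlib import Path

from spikeinterface.extractors import CedRecordingExtractor, Spike2RecordingExtractor

def choose_ced_extractor(file_path: str):
    """Return the old-format (.smr) or new-format (.smrx) extractor based on the file suffix."""
    suffix = Path(file_path).suffix.lower()
    return Spike2RecordingExtractor if suffix == ".smr" else CedRecordingExtractor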

[Feature]: Arbitrary DataInterface Inputs in YAML

What would you like to see added to nwb-conversion-tools?

Expand the YAML feature to allow arbitrary (non-path) inputs to the data interfaces. This had not been needed for anything until it became necessary to support running the spikeextractors_backend with the new NeuroscopeDataInterface, but it could be needed for any number of things in the future.

Is your feature request related to a problem?

No response

What solution would you like?

Based on the type specified by the source_schema for the interface, parse it in code as such (consider how json.load might affect how it is read from the .yml).

Do you have any interest in helping implement the feature?

Yes.

Code of Conduct

DataInterface for Neuralynx spike sorting formats

What would you like to see added to nwb-conversion-tools?

Is your feature request related to a problem?

No response

What solution would you like?

Do you have any interest in helping implement the feature?

No.

Code of Conduct

Signature of interfaces not propagated to the documentation API

So, for the interfaces we only get this for the function signature:

(screenshot of the rendered signature omitted)

See here for more examples:
https://neuroconv.readthedocs.io/en/main/api/interfaces.html

As this is a problem only for interfaces, I suspect it might be an issue arising from the new magic for overriding their Extractor attributes here:

class _LazyExtractorImport(type(BaseDataInterface), type):
    def __getattribute__(self, name):
        if name == "Extractor" and super().__getattribute__("Extractor") is None:
            extractor_module = get_package(package_name=super().__getattribute__("ExtractorModuleName"))
            extractor = getattr(
                extractor_module,
                super().__getattribute__("ExtractorName") or self.__name__.replace("Interface", "Extractor"),
            )
            return extractor
        return super().__getattribute__(name)

Keep internal logs of actions in converter objects

sneakers-the-rat's fork of the repo diverges in a number of ways. Though they used it for a different purpose, it's still worth bringing up as a point of discussion: keeping an internal log of what actions/results an NWBConverter object has undergone.

@luiztauffer Thoughts?

Is your feature request related to a problem? Please describe.
This would mostly be useful for debugging purposes: the user could give us the output of printing the internal class logs so we can see exactly how they were using the converter throughout their conversion process.

Describe the solution you'd like
An example snippet of what this might look like

def run_conversion(self, ..., **kwargs):
    ...

    # prepare dict for storage
    full_spec = dict(
        metadata=metadata,
        save_to_file=save_to_file,
        # ... other explicit arguments ...
        kwargs=dict(kwargs),
    )

    spec_key = "run_conversion"
    if spec_key not in self._spec:
        self._spec[spec_key] = [full_spec]
    else:
        self._spec[spec_key].append(full_spec)

    ...

Describe alternatives you've considered
The alternative for helping a user debug their conversion is simply to have them share their code and possibly data as well.

Checklist

  • Have you ensured the feature or change was not already reported?
  • Have you included a brief and descriptive title?
  • Have you included a clear description of the problem you are trying to solve?
  • Have you included a minimal code snippet that reproduces the issue you are encountering?
  • Have you checked our Contributing document?

[Feature]: add facemap conversion

What would you like to see added to NeuroConv?

facemap is pose estimation software similar to DLC and SLEAP, but specific to the face of a mouse. It might be possible to use ndx-pose for this and I think it would be useful to support this in NeuroConv. I believe we could use the output of their test suite as example data.

Is your feature request related to a problem?

No response

Do you have any interest in helping implement the feature?

No.

Code of Conduct

[Feature]: Add Bonsai

What would you like to see added to nwb-conversion-tools?

Source: https://github.com/dandi/neuroconv/blob/master/src/neuroconv/BonsaiRecordingExtractor.py

The extractor would need to be refactored to the newer SpikeInterface standard and propagated to neo if necessary.

We also need to find some example data somewhere, preferably as recently sampled as possible.

Is your feature request related to a problem?

No response

What solution would you like?

Goal is to make a BonsaiRecordingInterface.

Do you have any interest in helping implement the feature?

Yes, but I would need guidance.

Code of Conduct

[Feature]: Integrate with ProbeInterface

What would you like to see added to nwb-conversion-tools?

@D1o0g9s did a great job at the User Days making an extension to ecephys devices to allow attaching a probe geometry (defined as a polygon, only 2D supported ATM): https://github.com/D1o0g9s/ndx-probe-interface

Read support has been requested and outlined on SpikeInterface/probeinterface#12 (comment) steps 1 and 3, with write support being done here on NeuroConv within our spikeinterface tools.

@D1o0g9s did you still have interest/free time to help with these tasks? No pressure if not.

Is your feature request related to a problem?

No response

What solution would you like?

All modern spikeinterface recordings attach a probe to them, which can be loaded either automatically or selected from the vast library currently supported. Thus, we should support writing the extension devices whenever a geometry is specified on the probe. We should also write all the extra electrode columns regarding channel shapes, sizes, etc.

@h-mayorquin can you confirm this sounds about right?

Do you have any interest in helping implement the feature?

Yes, but I would need guidance.

Code of Conduct

Add support for multiple segments recorders in `add_electrical_series`

Right now the get_traces method within the add_electrical_series function located in the spikeinterface module only supports the case where the recording is mono-segment.

However, some of the data that we have in GIN is multi-segment (specifically, I know it is the case for Neuralynx). What we are doing right now is to sub-segment the recording and only propagate one segment:

https://github.com/catalystneuro/nwb-conversion-tools/blob/ca17dd2146b883d60d641090772f236da1a39869/src/nwb_conversion_tools/datainterfaces/ecephys/neuralynx/neuralynxdatainterface.py#L85-L87

We should modify the function add_electrical_series to account for this and write multiple segments.

I was discussing this with @bendichter today, and for these simple cases we could just iterate over every segment and write each as its own electrical series.
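
A rough sketch of that approach; whether add_electrical_series will take a segment_index argument (and how the es_key naming should work) is an assumption about the eventual implementation:

from neuroconv.tools.spikeinterface import add_electrical_series

def add_all_segments(recording, nwbfile):
    """Write every segment of a (possibly multi-segment) recording as its own ElectricalSeries."""
    num_segments = recording.get_num_segments()
    for segment_index in range(num_segments):
        es_key = "ElectricalSeries" if num_segments == 1 else f"ElectricalSeriesSegment{segment_index}"
        add_electrical_series(
            recording=recording,
            nwbfile=nwbfile,
            segment_index=segment_index,  # hypothetical argument
            es_key=es_key,
        )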

[Bug]: Latest ProbeInterface broke SpikeGLXInterface

What happened?

https://github.com/catalystneuro/neuroconv/runs/8136438434?check_suite_focus=true

For now, we are pinning to the previous version of ProbeInterface as an upper bound.

@h-mayorquin You know the most about this, do you think you can fix it?

Steps to Reproduce

pytest tests/test_on_data/test_gin_ecephys.py

Traceback

No response

Operating System

Linux

Python Executable

Python

Python Version

3.9

Package Versions

Same as full requirements, latest of all. Also included in the log above.

Code of Conduct

Use typing.Literals

What would you like to see added to NeuroConv?

There are a bunch of places throughout NeuroConv (and other packages in the ecosystem, too) where typing.Literal would make more sense than the current annotation types.

However, this annotation type is not supported in Python 3.7 (only 3.8+), so this will have to wait until we drop support for Python 3.7 (likely when security updates stop going out for it in summer 2023).
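
An illustrative sketch (requires Python >= 3.8); the specific accepted values shown for write_as are an assumption:

from typing import Literal

def describe_write_mode(write_as: Literal["raw", "lfp", "processed"] = "raw") -> str:
    """An argument currently annotated as a plain str could instead constrain its accepted values."""
    return f"Electrical series will be written as '{write_as}'."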

Is your feature request related to a problem?

No response

Do you have any interest in helping implement the feature?

Yes.

Code of Conduct

[Feature]: Add SLEAP

What would you like to see added to nwb-conversion-tools?

The SLEAP team recently added NWB support.

Would be nice to have a DataInterface here for that.

Similar to the way DLC initially did it, none of the logic for writing to an in-memory file was peeled out, so we would have to do or request that: https://github.com/talmolab/sleap/blob/develop/sleap/io/format/ndx_pose.py#L262-L339

This issue is primarily just for the pose estimation series output for each session; they are also doing a lot of work representing the training data through a new neurodata type as well: rly/ndx-pose#9

They have testing data available from their repo (https://github.com/talmolab/sleap/tree/develop/tests/data); for example usage of it, refer to https://github.com/talmolab/sleap/blob/e2ec3b3b9eec243d943f46ced6e61e3c4f69b27a/tests/io/test_formats.py#L365-L399

Is your feature request related to a problem?

No response

What solution would you like?

The goal is to make a SLEAPInterface for interfacing with their data.

Do you have any interest in helping implement the feature?

Yes, but I would need guidance.

Code of Conduct

[Feature]: Create general purpose data interface for audio files

What would you like to see added to NeuroConv?

The data interface for writing audio data using ndx-sound to NWB in catalystneuro/fee-lab-to-nwb#20 can be generalized to work for any audio file. This can be re-used in future projects.

Is your feature request related to a problem?

No response

Do you have any interest in helping implement the feature?

Yes.

Code of Conduct

[Feature]: Add NWB offsets write support for SpikeInterface.write_recording()

What would you like to see added to nwb-conversion-tools?

With the newly released version of PyNWB, writing offsets to TimeSeries (and crucially ElectricalSeries) is now possible and should be implemented in our tools for writing SpikeInterface recordings.

Is your feature request related to a problem?

No response

What solution would you like?

I would like to set offsets at the same time as conversion factors whenever they are non-zero when fetched from the recording object being written.
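
A minimal sketch, assuming PyNWB >= 2.1.0, of how an offset would sit next to the conversion factor (stored_value * conversion + offset gives the value in the declared unit); the numeric gain and offset values are placeholders:

from datetime import datetime

import numpy as np
from pynwb import NWBFile, TimeSeries

nwbfile = NWBFile(
    session_description="offset demo",
    identifier="offset-demo",
    session_start_time=datetime.now().astimezone(),
)
raw_trace = TimeSeries(
    name="RawTrace",
    data=np.random.randint(-2**15, 2**15, size=1000, dtype="int16"),
    unit="volts",
    conversion=1.95e-7,  # gain from integer counts to volts (placeholder)
    offset=1.65,         # additive offset in volts (placeholder)
    rate=30000.0,
)
nwbfile.add_acquisition(raw_trace)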

Do you have any interest in helping implement the feature?

Yes.

Code of Conduct

Adding new numpy typing conflicts with `get_source_schema`

Recently numpy has introduced new typing for array and dtypes:

https://numpy.org/devdocs/reference/typing.html

However, our machinery for calculating the source schema runs into bugs because of (at least right now) two reasons:

  1. They are made of a Union (in the typing sense) of many types, and some of those do not have a name attribute, which we use in our get_schema_from_method_signature function:
     https://github.com/catalystneuro/nwb-conversion-tools/blob/16126dbd3039444a3c74fa8a371cc43a596b07e5/src/nwb_conversion_tools/utils/json_schema.py#L76
  2. Sometimes there are multiple hits with respect to our types-to-json mapping and the following lines fail:
     https://github.com/catalystneuro/nwb-conversion-tools/blob/16126dbd3039444a3c74fa8a371cc43a596b07e5/src/nwb_conversion_tools/utils/json_schema.py#L81

I opened PR catalystneuro/nwb-conversion-tools#544 to showcase this issue.

[Bug]: `write_recording()` fails with `NWBZarrIO` backend

What happened?

When trying to write a recording object using the NWBZarrIO backend from hdmf-zarr, I get the following error related to the len of the SpikeInterfaceRecordingDataChunkIterator.

The same nwbfile does not produce any error when writing to HDF5.

@oruebel @rly I think this might be due to the way neuroconv handles the data iteration, but tagging you here in case it's a Zarr backend issue.

Steps to Reproduce

from pynwb import NWBFile
from hdmf_zarr.nwb import NWBZarrIO
from datetime import datetime
from neuroconv.tools.spikeinterface import write_recording

import spikeinterface.full as si


# create toy objects
rec, sort = si.toy_example()

# instantiate nwbfile
nwb_metadata = dict(session_description="toy_example", identifier="tpy", session_start_time=datetime.now())
nwbfile = NWBFile(**nwb_metadata)

# write recording
write_recording(rec, nwbfile=nwbfile)

# write to Zarr
with NWBZarrIO('test_recording_zarr.nwb', "w") as io:
    io.write(nwbfile)

Traceback

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf_zarr/backend.py:904, in ZarrIO.__list_fill__(self, parent, name, data, options)
    903 try:
--> 904     dset[:] = data  # If data is an h5py.Dataset then this will copy the data
    905 # For compound data types containing strings Zarr sometimes does not like wirting multiple values
    906 # try to write them one-at-a-time instead then

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/zarr/core.py:1373, in Array.__setitem__(self, selection, value)
   1372 else:
-> 1373     self.set_basic_selection(pure_selection, value, fields=fields)

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/zarr/core.py:1468, in Array.set_basic_selection(self, selection, value, fields)
   1467 else:
-> 1468     return self._set_basic_selection_nd(selection, value, fields=fields)

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/zarr/core.py:1772, in Array._set_basic_selection_nd(self, selection, value, fields)
   1770 indexer = BasicIndexer(selection, self)
-> 1772 self._set_selection(indexer, value, fields=fields)

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/zarr/core.py:1800, in Array._set_selection(self, indexer, value, fields)
   1799         value = np.asanyarray(value, like=self._meta_array)
-> 1800     check_array_shape('value', value, sel_shape)
   1802 # iterate over chunks in range

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/zarr/util.py:547, in check_array_shape(param, array, shape)
    546 if array.shape != shape:
--> 547     raise ValueError('parameter {!r}: expected array with shape {!r}, got {!r}'
    548                      .format(param, shape, array.shape))

ValueError: parameter 'value': expected array with shape (300000, 4), got ()

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
Cell In [22], line 2
      1 with NWBZarrIO('nwb-test-files/test_recording_zarr.nwb', "w") as io:
----> 2     io.write(nwbfile)

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf/utils.py:645, in docval.<locals>.dec.<locals>.func_call(*args, **kwargs)
    643 def func_call(*args, **kwargs):
    644     pargs = _check_args(args, kwargs)
--> 645     return func(args[0], **pargs)

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf_zarr/backend.py:151, in ZarrIO.write(self, **kwargs)
    149 """Overwrite the write method to add support for caching the specification"""
    150 cache_spec = popargs('cache_spec', kwargs)
--> 151 super(ZarrIO, self).write(**kwargs)
    152 if cache_spec:
    153     self.__cache_spec()

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf/utils.py:645, in docval.<locals>.dec.<locals>.func_call(*args, **kwargs)
    643 def func_call(*args, **kwargs):
    644     pargs = _check_args(args, kwargs)
--> 645     return func(args[0], **pargs)

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf/backends/io.py:51, in HDMFIO.write(self, **kwargs)
     49 container = popargs('container', kwargs)
     50 f_builder = self.__manager.build(container, source=self.__source, root=True)
---> 51 self.write_builder(f_builder, **kwargs)

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf/utils.py:645, in docval.<locals>.dec.<locals>.func_call(*args, **kwargs)
    643 def func_call(*args, **kwargs):
    644     pargs = _check_args(args, kwargs)
--> 645     return func(args[0], **pargs)

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf_zarr/backend.py:248, in ZarrIO.write_builder(self, **kwargs)
    246 f_builder, link_data, exhaust_dci = getargs('builder', 'link_data', 'exhaust_dci', kwargs)
    247 for name, gbldr in f_builder.groups.items():
--> 248     self.write_group(parent=self.__file,
    249                      builder=gbldr,
    250                      link_data=link_data,
    251                      exhaust_dci=exhaust_dci)
    252 for name, dbldr in f_builder.datasets.items():
    253     self.write_dataset(parent=self.__file,
    254                        builder=dbldr,
    255                        link_data=link_data,
    256                        exhaust_dci=exhaust_dci)

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf/utils.py:645, in docval.<locals>.dec.<locals>.func_call(*args, **kwargs)
    643 def func_call(*args, **kwargs):
    644     pargs = _check_args(args, kwargs)
--> 645     return func(args[0], **pargs)

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf_zarr/backend.py:283, in ZarrIO.write_group(self, **kwargs)
    281 if subgroups:
    282     for subgroup_name, sub_builder in subgroups.items():
--> 283         self.write_group(parent=group,
    284                          builder=sub_builder,
    285                          link_data=link_data,
    286                          exhaust_dci=exhaust_dci)
    288 datasets = builder.datasets
    289 if datasets:

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf/utils.py:645, in docval.<locals>.dec.<locals>.func_call(*args, **kwargs)
    643 def func_call(*args, **kwargs):
    644     pargs = _check_args(args, kwargs)
--> 645     return func(args[0], **pargs)

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf_zarr/backend.py:291, in ZarrIO.write_group(self, **kwargs)
    289 if datasets:
    290     for dset_name, sub_builder in datasets.items():
--> 291         self.write_dataset(parent=group,
    292                            builder=sub_builder,
    293                            link_data=link_data,
    294                            exhaust_dci=exhaust_dci)
    296 # write all links (haven implemented)
    297 links = builder.links

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf/utils.py:645, in docval.<locals>.dec.<locals>.func_call(*args, **kwargs)
    643 def func_call(*args, **kwargs):
    644     pargs = _check_args(args, kwargs)
--> 645     return func(args[0], **pargs)

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf_zarr/backend.py:736, in ZarrIO.write_dataset(self, **kwargs)
    734     self.__dci_queue.append(dataset=dset, data=data)
    735 elif hasattr(data, '__len__'):
--> 736     dset = self.__list_fill__(parent, name, data, options)
    737 else:
    738     dset = self.__scalar_fill__(parent, name, data, options)

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf_zarr/backend.py:908, in ZarrIO.__list_fill__(self, parent, name, data, options)
    905     # For compound data types containing strings Zarr sometimes does not like wirting multiple values
    906     # try to write them one-at-a-time instead then
    907     except ValueError:
--> 908         for i in range(len(data)):
    909             dset[i] = data[i]
    910 return dset

File ~/anaconda3/envs/nwb/lib/python3.9/site-packages/hdmf/data_utils.py:1028, in DataIO.__len__(self)
   1026 if not self.valid:
   1027     raise InvalidDataIOError("Cannot get length of data. Data is not valid.")
-> 1028 return len(self.data)

TypeError: object of type 'SpikeInterfaceRecordingDataChunkIterator' has no len()

Operating System

Linux

Python Executable

Python

Python Version

3.9

Package Versions

No response

Code of Conduct

[Feature]: form for requesting new format

What would you like to see added to NeuroConv?

This would be a feature request, but a specific kind of feature that could use its own issue form, including fields for a link to the project repo and example data. We could then include this form at the end of the gallery so anyone who looks for their format in the gallery and does not find it can submit a request easily.

Is your feature request related to a problem?

No response

Do you have any interest in helping implement the feature?

No.

Code of Conduct

gin data for CED has non-electrical series data

Hi, I am looking at the gin data for CED and I found the following:

# This generates an error
from pathlib import Path
from spikeinterface.extractors import CedRecordingExtractor

DATA_PATH = Path("/home/heberto/ephy_testing_data/")
file_path = DATA_PATH / "spike2" / "m365_1sec.smrx"
recorder = CedRecordingExtractor(file_path=file_path, stream_id=None)

recorder.neo_reader.header["signal_channels"][-10:]

The output is:

array([('RhdD-59', '62', 30030.03003003, 'int16', 'mV', 1.95312500e-04, 0.  , '0'),
       ('RhdD-60', '63', 30030.03003003, 'int16', 'mV', 1.95312500e-04, 0.  , '0'),
       ('RhdD-61', '64', 30030.03003003, 'int16', 'mV', 1.95312500e-04, 0.  , '0'),
       ('RhdD-62', '65', 30030.03003003, 'int16', 'mV', 1.95312500e-04, 0.  , '0'),
       ('RhdD-63', '66', 30030.03003003, 'int16', 'mV', 1.95312500e-04, 0.  , '0'),
       ('CED_Mech', '67', 30030.03003003, 'int16', 'g', 2.04467773e-03, 0.  , '0'),
       ('LFP', '68', 30030.03003003, 'int16', 'V', 5.03540039e-05, 1.65, '0'),
       ('MechTTL', '70', 30030.03003003, 'int16', 'V', 5.03540039e-05, 1.65, '0'),
       ('MechStim', '71', 30030.03003003, 'int16', 'V', 5.03540039e-05, 1.65, '0'),
       ('Laser', '72', 30030.03003003, 'int16', 'V', 5.03540039e-05, 1.65, '0')],
      dtype=[('name', '<U64'), ('id', '<U64'), ('sampling_rate', '<f8'), ('dtype', '<U16'), ('units', '<U64'), ('gain', '<f8'), ('offset', '<f8'), ('stream_id', '<U64')])

So, some of the channels are not electrical series: we have events, laser, and mechanical stimulation. Plus, channel 68 is LFP. It appears to me that these channels should not be written as an electrical series, but I don't know how standard the channel naming is or how to deal with it. Maybe someone who knows the format better could advise.

Related:
NeuralEnsemble/python-neo#1133

[Feature]: support python 3.11

What would you like to see added to NeuroConv?

Can we try this and see if it works as-is? Maybe we can have it as a test that is not required to pass?

Is your feature request related to a problem?

No response

Do you have any interest in helping implement the feature?

No.

Code of Conduct

[Documentation]: Combine /documentation into /docs

What would you like changed or added to the documentation and why?

@h-mayorquin or @weiglszonja, could one of you tackle this when you have the time?

Basically just merge the content of the .md file + .png found in https://github.com/catalystneuro/neuroconv/tree/main/documentation into the Readthedocs folder /docs so it's centralized.

Do you have any interest in helping write or edit the documentation?

No.

Code of Conduct

[Bug]: `BlackrockRecordingInterface` fails to read file

What happened?

I'm trying to convert Blackrock recording files with NeuroConv and I got some issues with the Neo package.
I posted an issue there: NeuralEnsemble/python-neo#1182.
The first error might be more specifically linked to BlackrockRecordingInterface, so I added the traceback below.

Steps to Reproduce

from pathlib import Path
from neuroconv.datainterfaces import BlackrockRecordingInterface
filePath = f"{fileDir}/fileName.ns6"
interface = BlackrockRecordingInterface(file_path=filePath, verbose=False)

Traceback

/home/wanglab/mambaforge3/envs/wanglabnwb/lib/python3.9/site-packages/tqdm/auto.py:22: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [2], in <cell line: 7>()
      5 filePath = f"{projectDir + subjectDir + sessionDir}sc099_1217_1800.ns6"
      6 # Change the file_path to the location in your system
----> 7 interface = BlackrockRecordingInterface(file_path=filePath, verbose=False)

File ~/mambaforge3/envs/wanglabnwb/lib/python3.9/site-packages/neuroconv/datainterfaces/ecephys/blackrock/blackrockdatainterface.py:78, in BlackrockRecordingInterface.__init__(self, file_path, nsx_override, verbose, spikeextractors_backend)
     75 spikeinterface = get_package(package_name="spikeinterface")
     77 self.RX = spikeinterface.extractors.BlackrockRecordingExtractor
---> 78 super().__init__(file_path=file_path, verbose=verbose)

File ~/mambaforge3/envs/wanglabnwb/lib/python3.9/site-packages/neuroconv/datainterfaces/ecephys/baserecordingextractorinterface.py:20, in BaseRecordingExtractorInterface.__init__(self, verbose, **source_data)
     18 def __init__(self, verbose: bool = True, **source_data):
     19     super().__init__(**source_data)
---> 20     self.recording_extractor = self.Extractor(**source_data)
     21     self.subset_channels = None
     22     self.verbose = verbose

File ~/mambaforge3/envs/wanglabnwb/lib/python3.9/site-packages/spikeinterface/extractors/neoextractors/blackrock.py:29, in BlackrockRecordingExtractor.__init__(self, file_path, stream_id, stream_name, block_index, all_annotations)
     27 def __init__(self, file_path, stream_id=None, stream_name=None, block_index=None, all_annotations=False):
     28     neo_kwargs = self.map_to_neo_kwargs(file_path)
---> 29     NeoBaseRecordingExtractor.__init__(self, stream_id=stream_id, 
     30                                        stream_name=stream_name,
     31                                        all_annotations=all_annotations,
     32                                        **neo_kwargs)
     33     self._kwargs.update({'file_path': str(file_path)})

File ~/mambaforge3/envs/wanglabnwb/lib/python3.9/site-packages/spikeinterface/extractors/neoextractors/neobaseextractor.py:59, in NeoBaseRecordingExtractor.__init__(self, stream_id, stream_name, block_index, all_annotations, **neo_kwargs)
     57 if stream_id is None and stream_name is None:
     58     if stream_channels.size > 1:
---> 59         raise ValueError(f"This reader have several streams: \nNames: {stream_names}\nIDs: {stream_ids}. "
     60                          f"Specify it with the 'stram_name' or 'stream_id' arguments")
     61     else:
     62         stream_id = stream_ids[0]

ValueError: This reader have several streams: 
Names: ['nsx2', 'nsx6']
IDs: ['2', '6']. Specify it with the 'stram_name' or 'stream_id' arguments

Operating System

Linux

Python Executable

Conda

Python Version

3.9

Package Versions

environment_for_issue.txt

Code of Conduct

semi-specified buffer size

For a GenericDataChunkIterator, I'd like to be able to set a buffer shape to:

buffer_shape=(1, None)
buffer_gb=10

This would be useful when you know you only want to read one channel at a time, but don't know how many time points you want to read.
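
A rough sketch of how such a semi-specified buffer shape could be resolved against the byte budget; the helper below is hypothetical and not part of GenericDataChunkIterator today:

import numpy as np

def resolve_buffer_shape(partial_shape, maxshape, dtype, buffer_gb):
    """Fill in None dimensions of a buffer shape up to a buffer_gb budget, capped at maxshape."""
    itemsize = np.dtype(dtype).itemsize
    fixed_elements = max(1, int(np.prod([dim for dim in partial_shape if dim is not None])))
    budget_elements = int(buffer_gb * 1e9) // itemsize
    return tuple(
        dim if dim is not None else min(max_dim, max(1, budget_elements // fixed_elements))
        for dim, max_dim in zip(partial_shape, maxshape)
    )

# One channel at a time, with the unspecified time axis filled up to the 10 GB budget
print(resolve_buffer_shape((1, None), maxshape=(384, 30_000_000), dtype="int16", buffer_gb=10))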

Neuralynx metadata improvements

During the discussion of #170 some things were left to-do. I am writing them here to keep track.

  • Add the capability to add_electrodes to write columns with boolean values (and remove the corresponding code from the Neuralynx interface).
  • Check whether NeuralynxRecordingInterface initialization or the writing process fails when non-.ncs files are present in the folder_path.

Improve VideoInterface

Writing here a series of tasks to do for improving the movie interface:

  • Add an example to the conversion gallery #183
  • Add timestamps stubbing. #181
  • Add another interface that converts a single movie file. The one we have now takes a list of movies and iterates over each one. Having a single movie interface would allow a simpler API for users, be easier to test, and allow us to do a good refactor of the current one
  • Add a video example with a variable frame rate to the GIN tests. Timestamp extraction and use with variable timestamps is currently not exercised.
  • Add an example of the single movie interface to the conversion gallery. I think this case is way more common and does not require the users to wrap their file paths into a list.
  • Separate the backend extraction from the conversion code. Right now the code for writing to nwb is tangled and mixed with the code for extracting properties from the videos with OpenCV. It is good practice to reduce the point of contact between functions (interfaces should touch each other at one place). This will simplify the code and if done well allow us to change the backend when needed without big headaches.
  • Some videos have a starting time that OpenCV does not extract. We should look for an example of this, add it to the GIN tests, and extract it automatically.

[When min PyNWB >=v2.1.0]: remove manual fillers for optional ElectrodeTable cols

What would you like to see added to nwb-conversion-tools?

With PyNWB v2.1.0, various electrode columns (x, y, z, location, imp) are no longer required, so we could remove the corresponding lines of our write_recording tools that use coded defaults, simply not propagate those columns when the recording object doesn't have the information set, and let PyNWB handle the defaults automatically.
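
A hedged sketch of that behavior; the mapping from SpikeInterface property names to electrode columns is an assumption:

OPTIONAL_ELECTRODE_PROPERTIES = ["x", "y", "z", "imp", "location"]

def add_electrodes_without_fillers(recording, nwbfile, electrode_group):
    """Only propagate the optional electrode columns that the recording actually carries."""
    available = set(recording.get_property_keys())
    for channel_index in range(recording.get_num_channels()):
        electrode_kwargs = dict(group=electrode_group)
        for prop in OPTIONAL_ELECTRODE_PROPERTIES:
            if prop in available:
                electrode_kwargs[prop] = recording.get_property(prop)[channel_index]
        nwbfile.add_electrode(**electrode_kwargs)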

What we need to decide (likely best at the next group meeting) is whether we want to enforce that the minimal version of the conversion tools uses the most recent PyNWB, et al.

Is your feature request related to a problem?

No response

What solution would you like?

Remove https://github.com/catalystneuro/nwb-conversion-tools/blob/main/src/nwb_conversion_tools/tools/spikeinterface/spikeinterface.py#L368-L380 and dependent lines.

Do you have any interest in helping implement the feature?

Yes.

Code of Conduct

Simplify conversions with multiple interfaces

Right now we have the conversion gallery, which illustrates as simply as possible how to do a conversion with a single interface:
https://neuroconv.readthedocs.io/en/main/conversion_examples_gallery/conversion_example_gallery.html#extracellular-electrophysiology

On the other hand, our workflow for handling conversions with multiple interfaces relies on the NWBConverter object, for which we also have a specific tutorial:
https://neuroconv.readthedocs.io/en/main/user_guide/nwbconverter.html

In my view, the use of the NWBConverter is not as simple as it could be. To be specific, the two complexities that I see for the end user are the following:

  1. They need to define a class themselves and inherit from the converter
  2. To define all the properties, they need to use a structure of nested dictionaries for the source data, conversion options, and metadata.

I feel that this creates an unnecessary gap in complexity between a single-interface conversion and a multiple-interface conversion. Moreover, the workflow for multiple interfaces does not build on the single-interface one, which seems like a missed opportunity. Concretely, single-interface conversions rely on their specific interfaces, whereas multiple-interface conversions rely on the NWBConverter object. It would be great if we could simplify the step from single to multiple interfaces so that multiple-interface conversions build on what we already have in the conversion gallery. I want to propose some solutions that go in this direction:

  1. The first solution is to adapt the NWBConverter object to take previously initialized interfaces as input. This allows us to mostly rely on the machinery that we already have in place, while at the same time letting users pass their already initialized interfaces (which they can copy-paste from the conversion gallery) to the NWBConverter object to build more complex pipelines. #164 shows a prototype for this.
  2. @bendichter has mentioned a couple of times that he takes inspiration from scikit-learn and I think we can follow the same course here. We could use something like their pipeline objects for users to concatenate conversions. Of course, the pipelines would rely on NWBConverter behind the scenes and this is just a matter of simplifying the interaction with the object. The advantage of this approach is that it relies on another widely used data model that is out there in a popular library so we can leverage this knowledge from the community.
  3. Since we introduced contexts to handle writing, most of our conversions can (and some do) return an NWB file with the data already attached to it. We could leverage this capability and instruct users that they can chain their single-interface conversions to attach data from other interfaces to the same file.

Example of pipeline:

# Step 1: build an nwbfile with PyNWB, or run the first conversion with `nwbfile_path` instead
nwbfile = interface1.run_conversion(nwbfile=nwbfile)  # Step 2
nwbfile = interface2.run_conversion(nwbfile=nwbfile)  # Step 3
# Step 4: save the file to disk

The drawback of this last approach is that we would lose all the work that we did on the NWBConverter object. Moreover, we would have less control over how the data is combined, and it would therefore be harder to test.

[Feature]: add favicon to documentation

What would you like to see added to NeuroConv?

I suppose this could just be the NWB logo for now, similar to PyNWB


Is your feature request related to a problem?

No response

Do you have any interest in helping implement the feature?

No.

Code of Conduct
