rec_to_nwb's People

Contributors

acomrie, aleksandercwikla, asilvaalex4, edeno, jguides, jihyunbak, lfrank, marcinatnovela, michaelcoulter, novelaneuro, rly, samuelbray32, shijiegu, wbodo, wojciechmerynda

rec_to_nwb's Issues

Make unique names for log files

Log files are currently saved as rec_to_nwb.log. It would be great to append the experiment date, and potentially the animal name, so that each log file has a unique name corresponding to its NWB file, for debugging and for referencing information about that file's generation.
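
A minimal sketch of one way to do this with the standard logging module; the animal_name and date arguments are assumptions about what would be available at setup time:

import logging

def make_log_handler(animal_name, date, log_dir='.'):
    # e.g. rec_to_nwb_peanut_20201109.log instead of a shared rec_to_nwb.log
    filename = f'{log_dir}/rec_to_nwb_{animal_name}_{date}.log'
    handler = logging.FileHandler(filename)
    handler.setFormatter(logging.Formatter(
        '%(asctime)s %(name)s %(levelname)s %(message)s'))
    return handler

logger = logging.getLogger('rec_to_nwb')
logger.addHandler(make_log_handler('peanut', '20201109'))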

Index error in position_originator when generating nwb file

I got the following index error in position_originator when generating an nwb file ("peanut20201109_.nwb"). Thanks in advance for any thoughts on this.

/home/jguidera/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/rec_to_binaries/read_binaries.py:73: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
return np.dtype(typearr)


KeyError Traceback (most recent call last)
/tmp/ipykernel_2347308/508651778.py in <module>
46 trodes_rec_export_args=trodes_rec_export_args)
47
---> 48 content = builder.build_nwb()
49 print(content)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in build_nwb(self, run_preprocessing, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
230 process_mda_invalid_time=process_mda_invalid_time,
231 process_pos_valid_time=process_pos_valid_time,
--> 232 process_pos_invalid_time=process_pos_invalid_time)
233
234 logger.info('Done...\n')

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in __build_nwb_file(self, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
245 logger.info('Date: {}'.format(date))
246 nwb_builder = self.get_nwb_builder(date)
--> 247 content = nwb_builder.build()
248 nwb_builder.write(content)
249 if self.is_old_dataset:

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/nwb_file_builder.py in build(self)
369 self.associated_files_originator.make(nwb_content)
370
--> 371 self.position_originator.make(nwb_content)
372
373 valid_map_dict = self.__build_corrupted_data_manager()

~/Src/rec_to_nwb/rec_to_nwb/processing/tools/beartype/beartype.py in func_beartyped(__beartype_func, *args, **kwargs)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in make(self, nwb_content)
50 zip(meters_per_pixels, position_tracking_paths)):
51 position_df = self.get_position_with_corrected_timestamps(
---> 52 position_tracking_path)
53 position.create_spatial_series(
54 name=f'series_{series_id}',

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in get_position_with_corrected_timestamps(position_tracking_path)
89
90 dio_systime = np.asarray(
---> 91 mcu_neural_timestamps.loc[dio_camera_ticks])
92 pause_mid_time = find_acquisition_timing_pause(dio_systime)
93 frame_rate_from_dio = get_framerate(

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexing.py in __getitem__(self, key)
929
930 maybe_callable = com.apply_if_callable(key, self.obj)
--> 931 return self._getitem_axis(maybe_callable, axis=axis)
932
933 def _is_scalar_access(self, key: tuple):

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
1151 raise ValueError("Cannot index with multidimensional key")
1152
-> 1153 return self._getitem_iterable(key, axis=axis)
1154
1155 # nested tuple slicing

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_iterable(self, key, axis)
1091
1092 # A collection of keys
-> 1093 keyarr, indexer = self._get_listlike_indexer(key, axis)
1094 return self.obj._reindex_with_indexers(
1095 {axis: [keyarr, indexer]}, copy=True, allow_dups=True

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexing.py in _get_listlike_indexer(self, key, axis)
1312 keyarr, indexer, new_indexer = ax._reindex_non_unique(keyarr)
1313
-> 1314 self._validate_read_indexer(keyarr, indexer, axis)
1315
1316 if needs_i8_conversion(ax.dtype) or isinstance(

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexing.py in _validate_read_indexer(self, key, indexer, axis)
1375
1376 not_found = list(ensure_index(key)[missing_mask.nonzero()[0]].unique())
-> 1377 raise KeyError(f"{not_found} not in index")
1378
1379

KeyError: '[43885432] not in index'
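
For debugging, the missing keys can be reported before the .loc lookup fails; a sketch, assuming mcu_neural_timestamps is the pandas object and dio_camera_ticks the integer array from the frame above:

import numpy as np

def safe_dio_systime(mcu_neural_timestamps, dio_camera_ticks):
    # Report camera ticks absent from the continuous-time index instead
    # of letting .loc raise a KeyError on the first missing key.
    missing = np.setdiff1d(dio_camera_ticks, mcu_neural_timestamps.index)
    if missing.size > 0:
        print(f'{missing.size} camera ticks not in index, e.g. {missing[:5]}')
    valid = np.intersect1d(dio_camera_ticks, mcu_neural_timestamps.index)
    return np.asarray(mcu_neural_timestamps.loc[valid])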

get_electrode_indices() is incorrect

This function does not return the indices of electrodes by their IDs in its current form, and it needs to be fixed. The fix either involves passing in the nwbfile itself so that the IDs can be looked up in the main NWB electrode table, or somehow getting those IDs from the electrode table region in the electrical series, but I don't know how to do that.

This will cause LFP extraction to be incorrect in some cases where there are bad channels, so it needs to be fixed ASAP.
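A sketch of the first proposed fix, passing in the nwbfile and looking up the IDs in the main electrode table; this is an illustration, not the library's current API, and the electrode-table-region variant would need a different lookup:

def get_electrode_indices_fixed(nwbfile, electrode_ids):
    # Map electrode IDs to their row positions in the NWB file's main
    # electrode table, rather than assuming IDs and positions coincide.
    all_ids = list(nwbfile.electrodes.id[:])
    return [all_ids.index(eid) for eid in electrode_ids]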

"Overwrite" errors when there is previously extracted data in the folder.

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/rec_to_binaries/trodes_data.py in _extract_rec_generic(self, export_cmd, export_dir_ext, dates, epochs, export_args, overwrite, stop_error, use_folder_date, parallel_instances, use_day_config)
   1368                     if os.path.exists(out_epoch_dir):
   1369                         if overwrite:
-> 1370                             shutil.rmtree(out_epoch_dir)
   1371                         else:
   1372                             raise TrodesDataFormatError(

~/anaconda3/envs/rec_to_nwb/lib/python3.7/shutil.py in rmtree(path, ignore_errors, onerror)
    496                     os.rmdir(path)
    497                 except OSError:
--> 498                     onerror(os.rmdir, path, sys.exc_info())
    499             else:
    500                 try:

~/anaconda3/envs/rec_to_nwb/lib/python3.7/shutil.py in rmtree(path, ignore_errors, onerror)
    494                 _rmtree_safe_fd(fd, path, onerror)
    495                 try:
--> 496                     os.rmdir(path)
    497                 except OSError:
    498                     onerror(os.rmdir, path, sys.exc_info())

OSError: [Errno 39] Directory not empty: '/opt/stelmo/shijie/recording_pilot/molly/preprocessing/20220415/20220415_molly_02_SeqSession1.time'
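
On network filesystems, rmtree can fail this way when another process still holds files open (NFS leaves .nfsXXXX placeholder files behind). A workaround sketch, using a hypothetical helper that is not part of rec_to_binaries:

import shutil
import time

def rmtree_with_retry(path, attempts=3, delay=1.0):
    # Retry removal a few times; stale NFS file handles often clear quickly.
    for attempt in range(attempts):
        try:
            shutil.rmtree(path)
            return
        except OSError:
            if attempt == attempts - 1:
                raise
            time.sleep(delay)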

error for old datasets with missing cameraHWSync files

This is a duplicate of NovelaNeuro#548.

Context

This is an "old dataset" issue.

Prerequisite

This dataset does not contain .videoTimeStamps.cameraHWSync files as in the newer datasets.

Instead, it contains .videoTimeStamps.cameraHWFrameCount files. This is an example epoch in this dataset:

20170120_KF2_02_r1.1.h264
20170120_KF2_02_r1.1.videoPositionTracking
20170120_KF2_02_r1.1.videoTimeStamps
20170120_KF2_02_r1.1.videoTimeStamps.cameraHWFrameCount
20170120_KF2_02_r1.rec
20170120_KF2_02_r1.stateScriptLog

Expected behavior

The builder should load the cameraHWFrameCount file and use it to determine the timestamps.

Current behavior

Builder looks for cameraHWSync files, and raises a FileNotFoundError.

Error message

Traceback:

~/proj/rec_to_nwb/rec_to_nwb/processing/builder/nwb_file_builder.py in build(self)
    300         self.camera_device_originator.make(nwb_content)
    301 
--> 302         self.video_files_originator.make(nwb_content)
    303 
    304         electrode_groups = self.electrode_group_originator.make(

~/proj/rec_to_nwb/rec_to_nwb/processing/builder/originators/video_files_originator.py in make(self, nwb_content)
     11 
     12     def make(self, nwb_content):
---> 13         fl_video_files = self.fl_video_files_manager.get_video_files()
     14         image_series_list = [
     15             VideoFilesCreator.create(fl_video_file, self.video_directory, nwb_content)

~/proj/rec_to_nwb/rec_to_nwb/processing/nwb/components/video_files/fl_video_files_manager.py in get_video_files(self)
     15 
     16     def get_video_files(self):
---> 17         extracted_video_files = self.fl_video_files_extractor.extract_video_files()
     18         return [
     19             self.fl_video_files_builder.build(

~/proj/rec_to_nwb/rec_to_nwb/processing/nwb/components/video_files/fl_video_files_extractor.py in extract_video_files(self)
     21                     self.raw_data_path + "/"
     22                     + video_file["name"][:-4]
---> 23                     + "videoTimeStamps.cameraHWSync"
     24                 )["data"]),
     25                 "device": video_file["camera_id"]

~/proj/rec_to_binaries/rec_to_binaries/read_binaries.py in readTrodesExtractedDataFile(filename)
     17 
     18     '''
---> 19     with open(filename, 'rb') as file:
     20         # Check if first line is start of settings block
     21         if file.readline().decode().strip() != '<Start settings>':

FileNotFoundError: [Errno 2] No such file or directory: '/home/jhbak/Data/rec_pilot//KF2/raw/20170120//20170120_KF2_01_s1.1.videoTimeStamps.cameraHWSync'
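
A sketch of the fallback the builder could apply, preferring the new-style cameraHWSync file and dropping back to cameraHWFrameCount for old datasets; the file-naming details are assumptions based on the epoch listing above:

import os

def find_camera_timing_file(raw_data_path, video_file_name):
    # video_file_name ends in '.h264'; stripping the last 4 characters
    # leaves the '...r1.1.' stem seen in the epoch listing above.
    base = os.path.join(raw_data_path, video_file_name[:-4])
    for suffix in ('videoTimeStamps.cameraHWSync',
                   'videoTimeStamps.cameraHWFrameCount'):
        if os.path.exists(base + suffix):
            return base + suffix
    raise FileNotFoundError(
        f'no cameraHWSync or cameraHWFrameCount file for {video_file_name}')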

error in position_originator.py

~/Documents/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in build_nwb(self, run_preprocessing, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
    230             process_mda_invalid_time=process_mda_invalid_time,
    231             process_pos_valid_time=process_pos_valid_time,
--> 232             process_pos_invalid_time=process_pos_invalid_time)
    233 
    234         logger.info('Done...\n')

~/Documents/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in __build_nwb_file(self, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
    245             logger.info('Date: {}'.format(date))
    246             nwb_builder = self.get_nwb_builder(date)
--> 247             content = nwb_builder.build()
    248             nwb_builder.write(content)
    249             if self.is_old_dataset:

~/Documents/rec_to_nwb/rec_to_nwb/processing/builder/nwb_file_builder.py in build(self)
    369             self.associated_files_originator.make(nwb_content)
    370 
--> 371         self.position_originator.make(nwb_content)
    372 
    373         valid_map_dict = self.__build_corrupted_data_manager()

~/Documents/rec_to_nwb/rec_to_nwb/processing/tools/beartype/beartype.py in func_beartyped(__beartype_func, *args, **kwargs)

~/Documents/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in make(self, nwb_content)
     50                 zip(meters_per_pixels, pos_online_paths)):
     51             position_df = self.get_position_with_corrected_timestamps(
---> 52                 pos_online_path)
     53             position.create_spatial_series(
     54                 name=f'series_{series_id}',

~/Documents/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in get_position_with_corrected_timestamps(pos_online_path)
     92         dio_systime = np.asarray(continuous_time.loc[dio_camera_ticks])
     93 
---> 94         pause_mid_time = find_acquisition_timing_pause(dio_systime)
     95 
     96         ptp_systime = np.asarray(camera_hwsync.HWTimestamp)

~/Documents/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in find_acquisition_timing_pause(timestamps, min_duration, max_duration, n_search)
    278     is_valid_gap = (timestamp_difference > min_duration) & (
    279         timestamp_difference < max_duration)
--> 280     pause_start_ind = np.nonzero(is_valid_gap)[0][0]
    281     pause_end_ind = pause_start_ind + 1
    282     pause_mid_time = (

IndexError: index 0 is out of bounds for axis 0 with size 0
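
The failure means that no gap between consecutive DIO timestamps fell inside the (min_duration, max_duration) window, so is_valid_gap is all False and indexing [0] raises. A defensive sketch of the gap search that fails with a clearer message:

import numpy as np

def find_pause_start(timestamps, min_duration, max_duration):
    timestamp_difference = np.diff(timestamps)
    is_valid_gap = ((timestamp_difference > min_duration)
                    & (timestamp_difference < max_duration))
    candidates = np.nonzero(is_valid_gap)[0]
    if candidates.size == 0:
        raise ValueError(
            'no acquisition timing pause found between '
            f'{min_duration} and {max_duration}; largest gap was '
            f'{timestamp_difference.max()}')
    return candidates[0]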

Video data not filled in

I have some nwb files that appear to be missing significant video information. These nwb files were created in ~Dec 21/Jan 22 using rec_to_nwb. When I open an example nwb file as nwbf using pynwb, the video-related section looks like so:

video_files pynwb.base.ProcessingModule at 0x140233856552816
Fields:
  data_interfaces: {
    video <class 'pynwb.behavior.BehavioralEvents'>
  }
  description: Contains all associated video files data

@rly please let us know if you have any ideas about what the issue might be in the video file section of rec_to_nwb. Thanks much for any help!


For a bit more context -

As a result, there is no video info to access using get_data_interface(), so I can't populate video file information in our DataJoint database system. That is, there are no Fields in video if I run the following (see the inspection sketch after the file summary below):
video = get_data_interface(nwbf, 'video', pynwb.behavior.BehavioralEvents)

And, in case useful, an example nwbf overall looks like so:

root pynwb.file.NWBFile at 0x140233856821808
Fields:
  acquisition: {
    e-series <class 'pynwb.ecephys.ElectricalSeries'>
  }
  devices: {
    camera_device 0 <class 'abc.CameraDevice'>,
    camera_device 1 <class 'abc.CameraDevice'>,
    dataacq_device0 <class 'abc.DataAcqDevice'>,
    header_device <class 'abc.HeaderDevice'>,
    probe 0 <class 'abc.Probe'>,
    probe 1 <class 'abc.Probe'>
  }
  electrode_groups: {
    0 <class 'abc.NwbElectrodeGroup'>,
    1 <class 'abc.NwbElectrodeGroup'>,
    10 <class 'abc.NwbElectrodeGroup'>,
    11 <class 'abc.NwbElectrodeGroup'>,
    12 <class 'abc.NwbElectrodeGroup'>,
    13 <class 'abc.NwbElectrodeGroup'>,
    14 <class 'abc.NwbElectrodeGroup'>,
    15 <class 'abc.NwbElectrodeGroup'>,
    16 <class 'abc.NwbElectrodeGroup'>,
    17 <class 'abc.NwbElectrodeGroup'>,
    18 <class 'abc.NwbElectrodeGroup'>,
    19 <class 'abc.NwbElectrodeGroup'>,
    2 <class 'abc.NwbElectrodeGroup'>,
    20 <class 'abc.NwbElectrodeGroup'>,
    21 <class 'abc.NwbElectrodeGroup'>,
    22 <class 'abc.NwbElectrodeGroup'>,
    23 <class 'abc.NwbElectrodeGroup'>,
    24 <class 'abc.NwbElectrodeGroup'>,
    25 <class 'abc.NwbElectrodeGroup'>,
    3 <class 'abc.NwbElectrodeGroup'>,
    4 <class 'abc.NwbElectrodeGroup'>,
    5 <class 'abc.NwbElectrodeGroup'>,
    6 <class 'abc.NwbElectrodeGroup'>,
    7 <class 'abc.NwbElectrodeGroup'>,
    8 <class 'abc.NwbElectrodeGroup'>,
    9 <class 'abc.NwbElectrodeGroup'>
  }
  electrodes: electrodes <class 'hdmf.common.table.DynamicTable'>
  epochs: epochs <class 'pynwb.epoch.TimeIntervals'>
  experiment_description: Memory and value guided decision making
  experimenter: ['Alison Comrie']
  file_create_date: [datetime.datetime(2021, 12, 22, 11, 29, 56, 77853, tzinfo=tzoffset(None, -28800))]
  identifier: 8b3cbede-635d-11ec-8b32-79740f285a96
  institution: University of California, San Francisco
  intervals: {
    epochs <class 'pynwb.epoch.TimeIntervals'>
  }
  lab: Loren Frank
  processing: {
    analog <class 'pynwb.base.ProcessingModule'>,
    associated_files <class 'pynwb.base.ProcessingModule'>,
    behavior <class 'pynwb.base.ProcessingModule'>,
    camera_sample_frame_counts <class 'pynwb.base.ProcessingModule'>,
    sample_count <class 'pynwb.base.ProcessingModule'>,
    tasks <class 'pynwb.base.ProcessingModule'>,
    video_files <class 'pynwb.base.ProcessingModule'>
  }
  session_description: Spatial bandit task (regular)
  session_id: j16_20210706
  session_start_time: 2021-07-06 09:24:39.436000-07:00
  subject: subject pynwb.file.Subject at 0x140233856044768
Fields:
  description: Long Evans Rat
  genotype: Wild Type
  sex: Male
  species: Rat
  subject_id: j16
  weight: 566g

  timestamps_reference_time: 1970-01-01 00:00:00+00:00
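
To check whether the container is really empty, the BehavioralEvents object can be inspected directly with pynwb; a minimal sketch, where 'example.nwb' is a placeholder path:

from pynwb import NWBHDF5IO

with NWBHDF5IO('example.nwb', 'r') as io:  # placeholder file name
    nwbf = io.read()
    video = nwbf.processing['video_files'].data_interfaces['video']
    # A correctly built file should list the per-video series here.
    print(list(video.time_series.keys()))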

Can't process days without .sc files

The pipeline does not allow the associated_files section of metadata.yml to be empty, nor does it allow that section to be removed. This matters when creating .nwb files for days where no .sc files need to be associated with the .rec file and other data.
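
A sketch of how the metadata handling could treat a missing or empty section as "no associated files"; metadata.yml here stands in for the day's metadata file:

import yaml

with open('metadata.yml') as file:  # the day's metadata file
    metadata = yaml.safe_load(file)

# Treat a missing or empty associated_files section as "no .sc files"
# instead of raising during validation.
for file_info in metadata.get('associated_files') or []:
    print('would process', file_info)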

Order of .part files incorrect

Generating preprocessing files for a recording with .part files (20201104_peanut_20_h2) seems to have failed because the .part files were appended out of order. This appears to have happened because part10 and part11 sort as earlier than part2, part3, etc. I was not able to find the section of code in rec_to_nwb where the sorting of .part files happens, so I suspect this occurs in rec_to_binaries. @edeno, could you confirm this? Thanks in advance for any help. (A natural-sort sketch follows the log below.)

(ID: 19) Appending from file: "/nimbus/jguidera/peanut/raw/20201104/20201104_peanut_20_h2.part10.rec"
(ID: 19) Calculating time offset for appended file...
(ID: 19) WARNING: this file does not contain information about the date and time when the file was created, so if the system clock was reset there is no way of knowing how much time elapsed between the two files. No offset will be applied.
(ID: 19) 0 %
(ID: 19) 5 %
(ID: 19) 10 %
(ID: 19) 15 %
(ID: 19) 20 %
(ID: 19) 25 %
(ID: 19) 30 %
(ID: 19) 35 %
(ID: 19) 40 %
(ID: 19) 45 %
(ID: 19) 50 %
(ID: 19) 55 %
(ID: 19) 60 %
(ID: 19) 65 %
(ID: 19) Gap of 94491 found. Packet: 71081448
(ID: 19) 70 %
(ID: 19) 75 %
(ID: 19) 80 %
(ID: 19) 85 %
(ID: 19) 90 %
(ID: 19) 95 %
(ID: 19)
(ID: 19) Appending from file: "/nimbus/jguidera/peanut/raw/20201104/20201104_peanut_20_h2.part11.rec"
(ID: 19) Calculating time offset for appended file...
(ID: 19) WARNING: this file does not contain information about the date and time when the file was created, so if the system clock was reset there is no way of knowing how much time elapsed between the two files. No offset will be applied.
(ID: 19)
(ID: 19) Appending from file: "/nimbus/jguidera/peanut/raw/20201104/20201104_peanut_20_h2.part2.rec"
(ID: 19) Calculating time offset for appended file...
(ID: 19) WARNING: this file does not contain information about the date and time when the file was created, so if the system clock was reset there is no way of knowing how much time elapsed between the two files. No offset will be applied.
(ID: 19) Error: timestamps do not begin with greater value than the end of the last file. Aborting.
(ID: 19)
(ID: 19) Done
(ID: 19)
(ID: 19) Done
(ID: 19)
(ID: 19) Done
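
Lexicographic sorting places part10 and part11 before part2; a natural-sort sketch that orders files by their numeric part index instead:

import re

def part_number(filename):
    # '20201104_peanut_20_h2.part10.rec' -> 10; the base .rec file
    # (no part suffix) sorts first.
    match = re.search(r'\.part(\d+)\.rec$', filename)
    return int(match.group(1)) if match else 0

rec_files = ['a.part10.rec', 'a.part11.rec', 'a.part2.rec', 'a.part3.rec']
print(sorted(rec_files, key=part_number))
# ['a.part2.rec', 'a.part3.rec', 'a.part10.rec', 'a.part11.rec']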

error in position_originator related to valid gaps

When generating an nwb file, I got the following error. It looks like the code was expecting is_valid_gap to be non-empty, but it was empty in my case. Thanks in advance for any thoughts about what could be going on here.

---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
/tmp/ipykernel_3384249/1204151749.py in <module>
46 trodes_rec_export_args=trodes_rec_export_args)
47
---> 48 content = builder.build_nwb()
49 print(content)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in build_nwb(self, run_preprocessing, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
230 process_mda_invalid_time=process_mda_invalid_time,
231 process_pos_valid_time=process_pos_valid_time,
--> 232 process_pos_invalid_time=process_pos_invalid_time)
233
234 logger.info('Done...\n')

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in __build_nwb_file(self, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
245 logger.info('Date: {}'.format(date))
246 nwb_builder = self.get_nwb_builder(date)
--> 247 content = nwb_builder.build()
248 nwb_builder.write(content)
249 if self.is_old_dataset:

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/nwb_file_builder.py in build(self)
369 self.associated_files_originator.make(nwb_content)
370
--> 371 self.position_originator.make(nwb_content)
372
373 valid_map_dict = self.__build_corrupted_data_manager()

~/Src/rec_to_nwb/rec_to_nwb/processing/tools/beartype/beartype.py in func_beartyped(__beartype_func, *args, **kwargs)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in make(self, nwb_content)
50 zip(meters_per_pixels, pos_online_paths)):
51 position_df = self.get_position_with_corrected_timestamps(
---> 52 pos_online_path)
53 position.create_spatial_series(
54 name=f'series_{series_id}',

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in get_position_with_corrected_timestamps(pos_online_path)
92 dio_systime = np.asarray(continuous_time.loc[dio_camera_ticks])
93
---> 94 pause_mid_time = find_acquisition_timing_pause(dio_systime)
95
96 ptp_systime = np.asarray(camera_hwsync.HWTimestamp)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in find_acquisition_timing_pause(timestamps, min_duration, max_duration, n_search)
278 is_valid_gap = (timestamp_difference > min_duration) & (
279 timestamp_difference < max_duration)
--> 280 pause_start_ind = np.nonzero(is_valid_gap)[0][0]
281 pause_end_ind = pause_start_ind + 1
282 pause_mid_time = (

IndexError: index 0 is out of bounds for axis 0 with size 0


NWB file initialized at the wrong size; could not write file. The NWB builder runs fine, but the content cannot be written to the NWB file.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<timed exec> in <module>

~/Documents/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in build_nwb(self, run_preprocessing, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
    228             process_mda_invalid_time=process_mda_invalid_time,
    229             process_pos_valid_time=process_pos_valid_time,
--> 230             process_pos_invalid_time=process_pos_invalid_time)
    231         logger.info('Done...\n')
    232 

~/Documents/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in __build_nwb_file(self, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
    243             nwb_builder = self.get_nwb_builder(date)
    244             content = nwb_builder.build()
--> 245             nwb_builder.write(content)
    246             if self.is_old_dataset:
    247                 logger.info('(old dataset: skipping append_to_nwb)')

~/Documents/rec_to_nwb/rec_to_nwb/processing/builder/nwb_file_builder.py in write(self, content)
    430         logger.info('Writing down content to ' + self.output_file)
    431         with NWBHDF5IO(path=self.output_file, mode='w') as nwb_fileIO:
--> 432             nwb_fileIO.write(content)
    433             nwb_fileIO.close()
    434 

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/hdmf/utils.py in func_call(*args, **kwargs)
    581             def func_call(*args, **kwargs):
    582                 pargs = _check_args(args, kwargs)
--> 583                 return func(args[0], **pargs)
    584         else:
    585             def func_call(*args, **kwargs):

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/hdmf/backends/hdf5/h5tools.py in write(self, **kwargs)
    405 
    406         cache_spec = popargs('cache_spec', kwargs)
--> 407         call_docval_func(super().write, kwargs)
    408         if cache_spec:
    409             self.__cache_spec()

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/hdmf/utils.py in call_docval_func(func, kwargs)
    422 def call_docval_func(func, kwargs):
    423     fargs, fkwargs = fmt_docval_args(func, kwargs)
--> 424     return func(*fargs, **fkwargs)
    425 
    426 

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/hdmf/utils.py in func_call(*args, **kwargs)
    581             def func_call(*args, **kwargs):
    582                 pargs = _check_args(args, kwargs)
--> 583                 return func(args[0], **pargs)
    584         else:
    585             def func_call(*args, **kwargs):

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/hdmf/backends/io.py in write(self, **kwargs)
     48         container = popargs('container', kwargs)
     49         f_builder = self.__manager.build(container, source=self.__source, root=True)
---> 50         self.write_builder(f_builder, **kwargs)
     51 
     52     @docval({'name': 'src_io', 'type': 'HDMFIO', 'doc': 'the HDMFIO object for reading the data to export'},

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/hdmf/utils.py in func_call(*args, **kwargs)
    581             def func_call(*args, **kwargs):
    582                 pargs = _check_args(args, kwargs)
--> 583                 return func(args[0], **pargs)
    584         else:
    585             def func_call(*args, **kwargs):

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/hdmf/backends/hdf5/h5tools.py in write_builder(self, **kwargs)
    804                           % (f_builder.name, self.source, kwargs))
    805         for name, gbldr in f_builder.groups.items():
--> 806             self.write_group(self.__file, gbldr, **kwargs)
    807         for name, dbldr in f_builder.datasets.items():
    808             self.write_dataset(self.__file, dbldr, **kwargs)

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/hdmf/utils.py in func_call(*args, **kwargs)
    581             def func_call(*args, **kwargs):
    582                 pargs = _check_args(args, kwargs)
--> 583                 return func(args[0], **pargs)
    584         else:
    585             def func_call(*args, **kwargs):

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/hdmf/backends/hdf5/h5tools.py in write_group(self, **kwargs)
    993             for subgroup_name, sub_builder in subgroups.items():
    994                 # do not create an empty group without attributes or links
--> 995                 self.write_group(group, sub_builder, **kwargs)
    996         # write all datasets
    997         datasets = builder.datasets

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/hdmf/utils.py in func_call(*args, **kwargs)
    581             def func_call(*args, **kwargs):
    582                 pargs = _check_args(args, kwargs)
--> 583                 return func(args[0], **pargs)
    584         else:
    585             def func_call(*args, **kwargs):

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/hdmf/backends/hdf5/h5tools.py in write_group(self, **kwargs)
    993             for subgroup_name, sub_builder in subgroups.items():
    994                 # do not create an empty group without attributes or links
--> 995                 self.write_group(group, sub_builder, **kwargs)
    996         # write all datasets
    997         datasets = builder.datasets

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/hdmf/utils.py in func_call(*args, **kwargs)
    581             def func_call(*args, **kwargs):
    582                 pargs = _check_args(args, kwargs)
--> 583                 return func(args[0], **pargs)
    584         else:
    585             def func_call(*args, **kwargs):

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/hdmf/backends/hdf5/h5tools.py in write_group(self, **kwargs)
    993             for subgroup_name, sub_builder in subgroups.items():
    994                 # do not create an empty group without attributes or links
--> 995                 self.write_group(group, sub_builder, **kwargs)
    996         # write all datasets
    997         datasets = builder.datasets

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/hdmf/utils.py in func_call(*args, **kwargs)
    581             def func_call(*args, **kwargs):
    582                 pargs = _check_args(args, kwargs)
--> 583                 return func(args[0], **pargs)
    584         else:
    585             def func_call(*args, **kwargs):

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/hdmf/backends/hdf5/h5tools.py in write_group(self, **kwargs)
    998         if datasets:
    999             for dset_name, sub_builder in datasets.items():
-> 1000                 self.write_dataset(group, sub_builder, **kwargs)
   1001         # write all links
   1002         links = builder.links

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/hdmf/utils.py in func_call(*args, **kwargs)
    581             def func_call(*args, **kwargs):
    582                 pargs = _check_args(args, kwargs)
--> 583                 return func(args[0], **pargs)
    584         else:
    585             def func_call(*args, **kwargs):

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/hdmf/backends/hdf5/h5tools.py in write_dataset(self, **kwargs)
   1294         self.__set_written(builder)
   1295         if exhaust_dci:
-> 1296             self.__exhaust_dcis()
   1297 
   1298     @classmethod

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/hdmf/backends/hdf5/h5tools.py in __exhaust_dcis(self)
    845             self.logger.debug("Exhausting DataChunkIterator from queue (length %d)" % len(self.__dci_queue))
    846             dset, data = self.__dci_queue.popleft()
--> 847             if self.__write_chunk__(dset, data):
    848                 self.__dci_queue.append((dset, data))
    849 

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/hdmf/backends/hdf5/h5tools.py in __write_chunk__(cls, dset, data)
   1391         dset.id.extend(max_bounds)
   1392         # Write the data
-> 1393         dset[chunk_i.selection] = chunk_i.data
   1394 
   1395         return True

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/h5py/_hl/dataset.py in __setitem__(self, args, val)
    947 
    948         # Perform the write, with broadcasting
--> 949         mspace = h5s.create_simple(selection.expand_shape(mshape))
    950         for fspace in selection.broadcast(mshape):
    951             self.id.write(mspace, fspace, val, mtype, dxpl=self._dxpl)

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/h5py/_hl/selections.py in expand_shape(self, source_shape)
    262                     eshape.append(t)
    263                 else:
--> 264                     raise TypeError("Can't broadcast %s -> %s" % (source_shape, self.array_shape))  # array shape
    265 
    266         if any([n > 1 for n in remaining_src_dims]):

TypeError: Can't broadcast (112007,) -> (112008,)
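
The mismatch suggests the HDF5 dataset was created one element larger than the data the chunk iterator ultimately yields. A diagnostic sketch for checking timestamp and data lengths before building; the helper and its arguments are hypothetical:

import numpy as np

def check_lengths(timestamps, data):
    # An off-by-one between the timestamp count and the data length
    # reproduces the broadcast error at write time, so fail early.
    n_timestamps = np.shape(timestamps)[0]
    n_samples = np.shape(data)[0]
    if n_timestamps != n_samples:
        raise ValueError(
            f'timestamps ({n_timestamps}) != data rows ({n_samples})')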

No such file or directory: 'trodesexport': 'trodesexport'

When generating an nwb file on breeze, I got the following error. This error is new since the last time I generated nwb files a few months ago. Thanks in advance for any help with this.

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/rec_to_binaries/trodes_data.py in get_trodes_version_from_path()
   1498     try:
-> 1499         result = str(subprocess.run(['exportmda', '-v'], capture_output=True)
   1500                      .stdout)

~/anaconda3/envs/rec_to_nwb/lib/python3.7/subprocess.py in run(input, capture_output, timeout, check, *popenargs, **kwargs)
    487 
--> 488     with Popen(*popenargs, **kwargs) as process:
    489         try:

~/anaconda3/envs/rec_to_nwb/lib/python3.7/subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors, text)
    799                                 errread, errwrite,
--> 800                                 restore_signals, start_new_session)
    801         except:

~/anaconda3/envs/rec_to_nwb/lib/python3.7/subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, start_new_session)
   1550                             err_msg += ': ' + repr(err_filename)
-> 1551                     raise child_exception_type(errno_num, err_msg, err_filename)
   1552                 raise child_exception_type(err_msg)

FileNotFoundError: [Errno 2] No such file or directory: 'exportmda': 'exportmda'

During handling of the above exception, another exception occurred:

FileNotFoundError                         Traceback (most recent call last)
/tmp/ipykernel_3703344/3730022203.py in <module>
     32                               output_path=output_path,
     33                               video_path=video_path,
---> 34                               trodes_rec_export_args=trodes_rec_export_args)
     35 
     36     content = builder.build_nwb()

~/Src/rec_to_nwb/rec_to_nwb/processing/tools/beartype/beartype.py in func_beartyped(__beartype_func, *args, **kwargs)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in __init__(self, data_path, animal_name, dates, nwb_metadata, output_path, video_path, preprocessing_path, extract_analog, extract_spikes, extract_lfps, extract_dio, extract_mda, overwrite, lfp_export_args, mda_export_args, analog_export_args, dio_export_args, time_export_args, spikes_export_args, parallel_instances, trodes_rec_export_args)
     92         validation_registrator.validate()
     93 
---> 94         trodes_version = get_trodes_version_from_path()[0]
     95 
     96         if lfp_export_args is None:

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/rec_to_binaries/trodes_data.py in get_trodes_version_from_path()
   1500                      .stdout)
   1501     except FileNotFoundError:
-> 1502         result = str(subprocess.run(['trodesexport', '-v'], capture_output=True)
   1503                      .stdout)
   1504     version = (result

~/anaconda3/envs/rec_to_nwb/lib/python3.7/subprocess.py in run(input, capture_output, timeout, check, *popenargs, **kwargs)
    486         kwargs['stderr'] = PIPE
    487 
--> 488     with Popen(*popenargs, **kwargs) as process:
    489         try:
    490             stdout, stderr = process.communicate(input, timeout=timeout)

~/anaconda3/envs/rec_to_nwb/lib/python3.7/subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors, text)
    798                                 c2pread, c2pwrite,
    799                                 errread, errwrite,
--> 800                                 restore_signals, start_new_session)
    801         except:
    802             # Cleanup if the child failed starting.

~/anaconda3/envs/rec_to_nwb/lib/python3.7/subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, start_new_session)
   1549                         if errno_num == errno.ENOENT:
   1550                             err_msg += ': ' + repr(err_filename)
-> 1551                     raise child_exception_type(errno_num, err_msg, err_filename)
   1552                 raise child_exception_type(err_msg)
   1553 

FileNotFoundError: [Errno 2] No such file or directory: 'trodesexport': 'trodesexport'
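
The usual cause is that the Trodes binaries are not on PATH for the process running the notebook; a quick sketch to check:

import shutil

for exe in ('trodesexport', 'exportmda'):
    print(exe, '->', shutil.which(exe) or 'not found on PATH')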

DIOs and camera frames misalignment

I am getting the following error. Thank you!

Checking associated file /stelmo/kyu/L5/raw/20230403/20230403_L5_01_r1.stateScriptLog
Checking associated file /stelmo/kyu/L5/raw/20230403/20230403_L5_02_r2.stateScriptLog

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
~/source/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in make(self, nwb_content)
     53                 position_df = self.get_position_with_corrected_timestamps(
---> 54                     position_tracking_path[0]
     55                 )

~/source/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in get_position_with_corrected_timestamps(position_tracking_path)
    137         # half second pause at the start to allow for alignment.
--> 138         pause_mid_time = find_acquisition_timing_pause(dio_systime)
    139 

~/source/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in find_acquisition_timing_pause(timestamps, min_duration, max_duration, n_search)
    391     )
--> 392     pause_start_ind = np.nonzero(is_valid_gap)[0][0]
    393     pause_end_ind = pause_start_ind + 1

IndexError: index 0 is out of bounds for axis 0 with size 0

During handling of the above exception, another exception occurred:

IndexError                                Traceback (most recent call last)
<timed exec> in <module>

~/source/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in build_nwb(self, run_preprocessing, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
    230             process_mda_invalid_time=process_mda_invalid_time,
    231             process_pos_valid_time=process_pos_valid_time,
--> 232             process_pos_invalid_time=process_pos_invalid_time)
    233 
    234         logger.info('Done...\n')

~/source/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in __build_nwb_file(self, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
    245             logger.info('Date: {}'.format(date))
    246             nwb_builder = self.get_nwb_builder(date)
--> 247             content = nwb_builder.build()
    248             nwb_builder.write(content)
    249             if self.is_old_dataset:

~/source/rec_to_nwb/rec_to_nwb/processing/builder/nwb_file_builder.py in build(self)
    369             self.associated_files_originator.make(nwb_content)
    370 
--> 371         self.position_originator.make(nwb_content)
    372 
    373         valid_map_dict = self.__build_corrupted_data_manager()

~/source/rec_to_nwb/rec_to_nwb/processing/tools/beartype/beartype.py in func_beartyped(__beartype_func, *args, **kwargs)

~/source/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in make(self, nwb_content)
     68                 )
     69                 video_df = self.get_corrected_timestamps_without_position(
---> 70                     video_file_path[0]
     71                 )
     72                 position.create_spatial_series(

~/source/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in get_corrected_timestamps_without_position(hw_frame_count_path)
    255         # The DIOs and camera frames are initially unaligned. There is a
    256         # half second pause at the start to allow for alignment.
--> 257         pause_mid_time = find_acquisition_timing_pause(dio_systime)
    258 
    259         if ptp_enabled:

~/source/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in find_acquisition_timing_pause(timestamps, min_duration, max_duration, n_search)
    390         timestamp_difference < max_duration
    391     )
--> 392     pause_start_ind = np.nonzero(is_valid_gap)[0][0]
    393     pause_end_ind = pause_start_ind + 1
    394     pause_mid_time = (

IndexError: index 0 is out of bounds for axis 0 with size 0

Timestamps for old dataset

NWB files from old datasets (without PTP) currently only contain camera frame counts, but not the actual timestamps. Need to implement the conversion.
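
A heavily hedged sketch of one possible conversion, assuming a constant nominal frame rate and a known session start time; the real implementation would need to align the frame counts to the acquisition clock:

import numpy as np

def frame_counts_to_timestamps(frame_counts, start_time, frame_rate=30.0):
    # frame_counts: the monotonically increasing hardware frame counter.
    # Maps each count to seconds since the session start, assuming a
    # constant nominal frame rate (only an approximation without PTP).
    frame_counts = np.asarray(frame_counts, dtype=float)
    return start_time + (frame_counts - frame_counts[0]) / frame_rate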

PTP seems to not be used when available

I pulled the most recent version of rec_to_nwb this morning and generated an nwb file for a single-epoch test recording that used PTP (J1620230614_.nwb). The .pos directory has the following files:
20230614_J16_12_r6.1.pos_cameraHWFrameCount.dat
20230614_J16_12_r6.1.pos_timestamps.dat
20230614_J16_12_r6.1.pos_online.dat

I am under the impression that if PTP is being used, there should be a cameraHWSync file in place of the cameraHWFrameCount file. @edeno, would you be able to comment on this? Thanks in advance.

dataset_names determined in two different ways

It appears that dataset_names is determined in two slightly different ways, and then the result of one is used to access the other. Most of the time the two approaches yield the same result, but when they do not, a bug arises.

nwb_file_builder.py's __extract_datasets() calls self.data_scanner.extract_data_from_date_folder(), which goes on to use __extract_experiments(), which in turn uses __extract_datasets() in data_scanner.py. That final __extract_datasets() determines dataset_names with one approach. Then, separately, in nwb_file_builder.py, dataset_names is determined with another approach, using self.data_scanner.get_all_epochs(). It seems like these two spots should use the same approach to determine dataset_names rather than two slightly different ones, because later in nwb_file_builder.py we do the following list comprehension:
[self.data_scanner.data[animal_name][date][dataset] for dataset in self.dataset_name]
If the two approaches have yielded distinct dataset_names, then the dataset key will not be found.

The two approaches seem to yield distinct dataset_names when a filename has an optional field following the sleep/run descriptor. For example, instead of date_animal_epochnumber_description.fileending, if the filename is date_animal_epochnumber_description_optionaltag.fileending, then the dataset_names may take either the form epochnumber_optionaltag or epochnumber_description under the two approaches, respectively. Instead, we want both methods to yield a string of the form epochnumber_description.

I can rename files to avoid having this optionaltag at the end of the filename if needed, but it would be useful to be able to add optional info to the filename while still satisfying our documentation's naming conventions (per the regular expression: ^(\d*)_([a-zA-Z0-9]*)_(\d*)_{0,1}(\w*)\.{0,1}(.*)\.([a-zA-Z0-9]*)$). Moreover, it is odd for us to determine dataset_names in two different ways and then use them as if they were identical. Is there a reason it is the way it is that I am missing? Would it be okay to change it so that both get_all_epochs() and __extract_datasets() determine dataset_names the same way? I can work on a PR if someone more familiar with this code thinks that seems okay, but I wanted to see what folks think first, since I am not very familiar with this code. Thanks much for any thoughts. (A parsing sketch follows.)
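
A sketch of deriving a single canonical epochnumber_description name from the documented regex, which both code paths could share; splitting off the optional tag is an assumption about the intended convention:

import re

FILENAME_RE = re.compile(
    r'^(\d*)_([a-zA-Z0-9]*)_(\d*)_{0,1}(\w*)\.{0,1}(.*)\.([a-zA-Z0-9]*)$')

def dataset_name(filename):
    date, animal, epoch, description = FILENAME_RE.match(filename).groups()[:4]
    # Drop any optional tag after the description so both code paths agree.
    description = description.split('_')[0]
    return f'{epoch}_{description}'

print(dataset_name('20211007_frank_01_sleep_optionaltag.rec'))  # '01_sleep'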

Example error -

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/tmp/ipykernel_3397871/450084155.py in <module>
     37                               video_path=video_path,
     38                               trodes_rec_export_args = trodes_rec_export_args)
---> 39         content = builder.build_nwb()
     40         print(content) #just to look
     41         # Automatically delete preprocessing files? Not for this guy given old pipeline use still

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in build_nwb(self, run_preprocessing, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
    224             process_mda_invalid_time=process_mda_invalid_time,
    225             process_pos_valid_time=process_pos_valid_time,
--> 226             process_pos_invalid_time=process_pos_invalid_time)
    227         logger.info('Done...\n')
    228 

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in __build_nwb_file(self, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
    237         for date in self.dates:
    238             logger.info('Date: {}'.format(date))
--> 239             nwb_builder = self.get_nwb_builder(date)
    240             content = nwb_builder.build()
    241             nwb_builder.write(content)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in get_nwb_builder(self, date)
    273             reconfig_header=self.__get_header_path(),
    274             # reconfig_header=self.__is_rec_config_valid()
--> 275             **old_dataset_kwargs
    276         )
    277 

~/Src/rec_to_nwb/rec_to_nwb/processing/tools/beartype/beartype.py in func_beartyped(__beartype_func, *args, **kwargs)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/nwb_file_builder.py in __init__(self, data_path, animal_name, date, nwb_metadata, process_dio, process_mda, process_analog, process_pos_timestamps, preprocessing_path, video_path, output_file, reconfig_header, is_old_dataset, session_start_time)
    218         validation_registrator.validate()
    219 
--> 220         self.__extract_datasets(animal_name, date)
    221 
    222         self.corrupted_data_manager = CorruptedDataManager(self.metadata)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/nwb_file_builder.py in __extract_datasets(self, animal_name, date)
    339         self.data_scanner.extract_data_from_date_folder(date)
    340         self.datasets = [self.data_scanner.data[animal_name]
--> 341                          [date][dataset] for dataset in self.dataset_names]
    342 
    343     def build(self):

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/nwb_file_builder.py in <listcomp>(.0)
    339         self.data_scanner.extract_data_from_date_folder(date)
    340         self.datasets = [self.data_scanner.data[animal_name]
--> 341                          [date][dataset] for dataset in self.dataset_names]
    342 
    343     def build(self):

KeyError: '01_sleep'

And then if we take a look here, self.data_scanner.data[animal_name][date] evaluates to

{'sleep_p1': <rec_to_nwb.processing.tools.dataset.Dataset object at 0x7f2762e27490>, 'sleep_p2': 
<rec_to_nwb.processing.tools.dataset.Dataset object at 0x7f2762e27990>, 'sleep_p3': 
<rec_to_nwb.processing.tools.dataset.Dataset object at 0x7f2762e27f90>, 'sleep_p4': 
<rec_to_nwb.processing.tools.dataset.Dataset object at 0x7f2762e27610>, 'sleep_p5': 
<rec_to_nwb.processing.tools.dataset.Dataset object at 0x7f2762f3f790>, 'lineartrack_p1': 
<rec_to_nwb.processing.tools.dataset.Dataset object at 0x7f276471dc10>, 'lineartrack_p2': 
<rec_to_nwb.processing.tools.dataset.Dataset object at 0x7f2763a846d0>, 'lineartrack_p3': 
<rec_to_nwb.processing.tools.dataset.Dataset object at 0x7f2763a84c50>, 'lineartrack_p4': 
<rec_to_nwb.processing.tools.dataset.Dataset object at 0x7f2763a84390>, 'lineartrack_p5': 
<rec_to_nwb.processing.tools.dataset.Dataset object at 0x7f2762f3e510>}

So it is looking for '01_sleep' correctly, but the entry it needs is currently (incorrectly) named 'sleep_p1'.

key error with offline position tracking


KeyError Traceback (most recent call last)
/tmp/ipykernel_3135872/2747406067.py in <module>
20 overwrite = False)
21
---> 22 builder.build_nwb()
23

~/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in build_nwb(self, run_preprocessing, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
230 process_mda_invalid_time=process_mda_invalid_time,
231 process_pos_valid_time=process_pos_valid_time,
--> 232 process_pos_invalid_time=process_pos_invalid_time)
233
234 logger.info('Done...\n')

~/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in __build_nwb_file(self, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
245 logger.info('Date: {}'.format(date))
246 nwb_builder = self.get_nwb_builder(date)
--> 247 content = nwb_builder.build()
248 nwb_builder.write(content)
249 if self.is_old_dataset:

~/rec_to_nwb/rec_to_nwb/processing/builder/nwb_file_builder.py in build(self)
369 self.associated_files_originator.make(nwb_content)
370
--> 371 self.position_originator.make(nwb_content)
372
373 valid_map_dict = self.__build_corrupted_data_manager()

~/rec_to_nwb/rec_to_nwb/processing/tools/beartype/beartype.py in func_beartyped(__beartype_func, *args, **kwargs)

~/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in make(self, nwb_content)
50 zip(meters_per_pixels, position_tracking_paths)):
51 position_df = self.get_position_with_corrected_timestamps(
---> 52 position_tracking_path)
53 position.create_spatial_series(
54 name=f'series_{series_id}',

~/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in get_position_with_corrected_timestamps(position_tracking_path)
101 # Additionally, for offline tracking, frames can be skipped if the
102 # frame is labeled as bad.
--> 103 video_info = video_info.loc[position_tracking.index.unique()]
104 frame_count = np.asarray(video_info.HWframeCount)
105 ptp_systime = np.asarray(video_info.HWTimestamp)

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexing.py in __getitem__(self, key)
929
930 maybe_callable = com.apply_if_callable(key, self.obj)
--> 931 return self._getitem_axis(maybe_callable, axis=axis)
932
933 def _is_scalar_access(self, key: tuple):

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
1151 raise ValueError("Cannot index with multidimensional key")
1152
-> 1153 return self._getitem_iterable(key, axis=axis)
1154
1155 # nested tuple slicing

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_iterable(self, key, axis)
1091
1092 # A collection of keys
-> 1093 keyarr, indexer = self._get_listlike_indexer(key, axis)
1094 return self.obj._reindex_with_indexers(
1095 {axis: [keyarr, indexer]}, copy=True, allow_dups=True

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexing.py in _get_listlike_indexer(self, key, axis)
1312 keyarr, indexer, new_indexer = ax._reindex_non_unique(keyarr)
1313
-> 1314 self._validate_read_indexer(keyarr, indexer, axis)
1315
1316 if needs_i8_conversion(ax.dtype) or isinstance(

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexing.py in _validate_read_indexer(self, key, indexer, axis)
1372 if use_interval_msg:
1373 key = list(key)
-> 1374 raise KeyError(f"None of [{key}] are in the [{axis_name}]")
1375
1376 not_found = list(ensure_index(key)[missing_mask.nonzero()[0]].unique())

KeyError: "None of [UInt64Index([ 15318149, 15319121, 15320092, 15321063, 15322035, 15323006,\n 15323978, 15340762, 15341733, 15342704,\n ...\n 127631504, 127632472, 127633448, 127634416, 127635392, 127636360,\n 127637328, 127638304, 127639272, 127640248],\n dtype='uint64', name='time', length=115617)] are in the [index]"

data directory: '/cumulus/david/Scn2a/coh5/CH65/20211125/preprocessing'
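
Since the error says none of the position-tracking timestamps are in the video-info index, the two files likely use different time bases. A workaround sketch that intersects the indexes first; video_info and position_tracking are the DataFrames named in the traceback:

def align_video_info(video_info, position_tracking):
    # Keep only timestamps present in both files; an empty intersection
    # means the two files use entirely different time bases.
    common = video_info.index.intersection(position_tracking.index.unique())
    if len(common) == 0:
        raise ValueError('position tracking and video info share no '
                         'timestamps; check the time columns of both files')
    return video_info.loc[common]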

Attribute error when getting unique timestamps

I ran into this error after pulling the latest commit that looks for unique pos timestamps. Please see the debugging and traceback below. It looks like you might be trying to do a data conversion here as well, @edeno, so I'm not sure what your preferred fix is. Alternatively, if this did not arise in others' testing and is an issue on my end, any thoughts on what might be going on are greatly appreciated. Thanks much.

Error: 'numpy.ndarray' object has no attribute 'to_numpy'

Debugging:

/home/alison/Src/rec_to_nwb/rec_to_nwb/processing/nwb/components/position/pos_timestamp_manager.py(35)_get_timestamps()
33 self.directories[dataset_id][0])
34 position = pd.DataFrame(pos_online['data'])
---> 35 return position.time.unique().to_numpy(dtype='int64')
36
37 def retrieve_real_timestamps(self, dataset_id):

ipdb> position.time.unique()

array([37429635, 37430514, 37431434, ..., 91520405, 91521307, 91522500],
dtype=uint32)

ipdb> position.time.unique().to_numpy()

*** AttributeError: 'numpy.ndarray' object has no attribute 'to_numpy'

Traceback:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_3965895/3392245823.py in <module>
     36                               video_path=video_path,
     37                               trodes_rec_export_args = trodes_rec_export_args)
---> 38         content = builder.build_nwb()
     39         print(content) #just to look
     40         # Automatically delete preprocessing files? Not for this guy given old pipeline use still

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in build_nwb(self, run_preprocessing, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
    224             process_mda_invalid_time=process_mda_invalid_time,
    225             process_pos_valid_time=process_pos_valid_time,
--> 226             process_pos_invalid_time=process_pos_invalid_time)
    227         logger.info('Done...\n')
    228 

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in __build_nwb_file(self, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
    238             logger.info('Date: {}'.format(date))
    239             nwb_builder = self.get_nwb_builder(date)
--> 240             content = nwb_builder.build()
    241             nwb_builder.write(content)
    242             if self.is_old_dataset:

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/nwb_file_builder.py in build(self)
    375             self.associated_files_originator.make(nwb_content)
    376 
--> 377         self.position_originator.make(nwb_content)
    378 
    379         valid_map_dict = self.__build_corrupted_data_manager()

~/Src/rec_to_nwb/rec_to_nwb/processing/tools/beartype/beartype.py in func_beartyped(__beartype_func, *args, **kwargs)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in make(self, nwb_content)
     34     def make(self, nwb_content: NWBFile):
     35         logger.info('Position: Building')
---> 36         fl_positions = self.fl_position_manager.get_fl_positions()
     37         logger.info('Position: Creating')
     38         position = self.position_creator.create_all(fl_positions)

~/Src/rec_to_nwb/rec_to_nwb/processing/tools/beartype/beartype.py in func_beartyped(__beartype_func, *args, **kwargs)

~/Src/rec_to_nwb/rec_to_nwb/processing/nwb/components/position/fl_position_manager.py in get_fl_positions(self)
     35         columns_labels = self.fl_position_extractor.get_columns_labels()
     36         if self.process_timestamps:
---> 37             timestamps = self.fl_position_extractor.get_timestamps()
     38 
     39             validate_parameters_equal_length(

~/Src/rec_to_nwb/rec_to_nwb/processing/nwb/components/position/fl_position_extractor.py in get_timestamps(self)
     70             for position_directory, continuous_time_directory in zip(
     71                 self.all_position_directories,
---> 72                 self.continuous_time_directories)
     73         ]

~/Src/rec_to_nwb/rec_to_nwb/processing/nwb/components/position/fl_position_extractor.py in <listcomp>(.0)
     68                     convert_timestamps=self.convert_timestamps
     69                 ))
---> 70             for position_directory, continuous_time_directory in zip(
     71                 self.all_position_directories,
     72                 self.continuous_time_directories)

~/Src/rec_to_nwb/rec_to_nwb/processing/nwb/components/position/pos_timestamp_manager.py in __init__(self, directories, continuous_time_directories, convert_timestamps)
     24                  convert_timestamps=True):
     25         TimestampManager.__init__(
---> 26             self, directories, continuous_time_directories)
     27         self.convert_timestamps = convert_timestamps
     28 

~/Src/rec_to_nwb/rec_to_nwb/processing/nwb/common/timestamps_manager.py in __init__(self, directories, continuous_time_directories)
     25 
     26         self.number_of_datasets = self._get_number_of_datasets()
---> 27         self.file_lengths_in_datasets = self.__calculate_file_lengths_in_datasets()
     28 
     29     @abc.abstractmethod

~/Src/rec_to_nwb/rec_to_nwb/processing/nwb/common/timestamps_manager.py in __calculate_file_lengths_in_datasets(self)
     52 
     53     def __calculate_file_lengths_in_datasets(self):
---> 54         return [self._get_data_shape(i) for i in range(self.number_of_datasets)]
     55 
     56     def _get_number_of_datasets(self):

~/Src/rec_to_nwb/rec_to_nwb/processing/nwb/common/timestamps_manager.py in <listcomp>(.0)
     52 
     53     def __calculate_file_lengths_in_datasets(self):
---> 54         return [self._get_data_shape(i) for i in range(self.number_of_datasets)]
     55 
     56     def _get_number_of_datasets(self):

~/Src/rec_to_nwb/rec_to_nwb/processing/nwb/common/timestamps_manager.py in _get_data_shape(self, dataset_num)
     58 
     59     def _get_data_shape(self, dataset_num):
---> 60         return np.shape(self.read_timestamps_ids(dataset_num))[0]

~/Src/rec_to_nwb/rec_to_nwb/processing/nwb/common/timestamps_manager.py in read_timestamps_ids(self, dataset_id)
     40 
     41     def read_timestamps_ids(self, dataset_id):
---> 42         return self._get_timestamps(dataset_id)
     43 
     44     def get_final_data_shape(self):

~/Src/rec_to_nwb/rec_to_nwb/processing/nwb/components/position/pos_timestamp_manager.py in _get_timestamps(self, dataset_id)
     33             self.directories[dataset_id][0])
     34         position = pd.DataFrame(pos_online['data'])
---> 35         return position.time.unique().to_numpy(dtype='int64')
     36 
     37     def retrieve_real_timestamps(self, dataset_id):

AttributeError: 'numpy.ndarray' object has no attribute 'to_numpy'
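
For reference, pandas Series.unique() returns a plain numpy.ndarray, which has no .to_numpy method (that method exists only on Series and DataFrame objects). A minimal hedged fix for the failing line in pos_timestamp_manager._get_timestamps would be to cast the array directly:

# Series.unique() already yields an ndarray, so cast with astype
# rather than calling the Series/DataFrame-only .to_numpy method.
return position.time.unique().astype('int64')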

FileNotFoundError: need either cameraHWSync or cameraHWFrameCount files.

Sorry, I ran into another error while processing franklab_nwb_generation.
Could you help me out with this?

I have 'cameraHWSync' files in my data path but no 'cameraHWFrameCount' files.
My trodes_rec_export_args is below:

trodes_rec_export_args = ('-reconfig','/home/yshwang/yaml/Livermore_D2_128ch_4s_6mm_15um_Spikegadgets_2164X2_H70_NWB_conversion.xml') 


---------------------------------------------------------------------------

builder = RawToNWBBuilder(animal_name=animal_name,
                          data_path=data_path,
                          dates=dates,
                          nwb_metadata=metadata,
                          overwrite=overwrite,
                          output_path=output_path,
                          video_path=video_path,
                          trodes_rec_export_args = trodes_rec_export_args)

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
/tmp/ipykernel_3737993/2486487710.py in <module>
      6                           output_path=output_path,
      7                           video_path=video_path,
----> 8                           trodes_rec_export_args = trodes_rec_export_args)

~/source/rec_to_nwb/rec_to_nwb/processing/tools/beartype/beartype.py in func_beartyped(__beartype_func, *args, **kwargs)

~/source/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in __init__(self, data_path, animal_name, dates, nwb_metadata, output_path, video_path, preprocessing_path, extract_analog, extract_spikes, extract_lfps, extract_dio, extract_mda, overwrite, lfp_export_args, mda_export_args, analog_export_args, dio_export_args, time_export_args, spikes_export_args, parallel_instances, trodes_rec_export_args)
    158         self.trodes_rec_export_args = trodes_rec_export_args
    159 
--> 160         self.is_old_dataset = self.__is_old_dataset()
    161 
    162     def __repr__(self):

~/source/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in __is_old_dataset(self)
    201             return True
    202         raise FileNotFoundError(
--> 203             'need either cameraHWSync or cameraHWFrameCount files.')
    204 
    205     def build_nwb(self, run_preprocessing=True,

FileNotFoundError: need either cameraHWSync or cameraHWFrameCount files.
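
As the traceback shows, the builder classifies a dataset as old or new in __is_old_dataset by searching the raw session directories for cameraHWFrameCount vs cameraHWSync files, so this error usually means neither pattern was matched where the code looks. A quick hedged diagnostic, reusing data_path and animal_name from the builder call above (the date is a placeholder):

import glob
import os

raw_dir = os.path.join(data_path, animal_name, 'raw', '20211001')
# If both lists print empty, check the file names and locations;
# the builder needs at least one of these two patterns to exist.
print(glob.glob(os.path.join(raw_dir, '*cameraHWSync*')))
print(glob.glob(os.path.join(raw_dir, '*cameraHWFrameCount*')))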

error at write() for datasets with mixed electrode types

Context

The dataset contains recordings from different electrode (ntrode) types, with a different number of channels per shank.

For example, the KF2 dataset has a mix of tetrodes (1 shank with 4 channels) and 32-channel probes (2 shanks with 16 channels each).

The preprocessed binary files *.nt[i].mda contain data from the i-th ntrode, where the number of rows corresponds to the number of channels.

Expected behavior

MultiThreadDataIterator should detect the correct file size for each dataset, to determine the target index range in the NWB file that the data chunk should occupy.

Current behavior

MultiThreadDataIterator creates a DataChunk with mismatching data size and selection size.

This is because in DataManager, the number of rows per file is treated as a fixed value for all files (determined from the first file in the directory).
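
For illustration, a minimal hedged sketch of the per-file approach (get_rows_per_file and read_mda are hypothetical names; loaders vary): record each file's own row count instead of reusing the shape of the first file in the directory:

import numpy as np

def get_rows_per_file(mda_paths, read_mda):
    # read_mda is assumed to return an (n_channels, n_samples)
    # array for a single *.nt[i].mda file.
    return [np.shape(read_mda(path))[0] for path in mda_paths]

# With mixed probes, tetrode files contribute 4 rows and 16-channel
# shank files contribute 16, so the iterator's selection size can
# match each DataChunk instead of assuming a single fixed width.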

Error traceback

After building the NWB content successfully, NWBFileBuilder.write() fails:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-24-0e26a06ac520> in <module>
      1 # build successful; now write NWB file
----> 2 nwb_builder.write(nwb_content)

~/proj/rec_to_nwb/rec_to_nwb/processing/builder/old_nwb_file_builder.py in write(self, content)
    334         logger.info('Writing down content to ' + self.output_file)
    335         with NWBHDF5IO(path=self.output_file, mode='w') as nwb_fileIO:
--> 336             nwb_fileIO.write(content)
    337             nwb_fileIO.close()
    338 

~/opt/anaconda3/envs/nwb/lib/python3.6/site-packages/hdmf/utils.py in func_call(*args, **kwargs)
    559             def func_call(*args, **kwargs):
    560                 pargs = _check_args(args, kwargs)
--> 561                 return func(args[0], **pargs)
    562         else:
    563             def func_call(*args, **kwargs):

~/opt/anaconda3/envs/nwb/lib/python3.6/site-packages/hdmf/backends/hdf5/h5tools.py in write(self, **kwargs)
    321 
    322         cache_spec = popargs('cache_spec', kwargs)
--> 323         call_docval_func(super().write, kwargs)
    324         if cache_spec:
    325             self.__cache_spec()

~/opt/anaconda3/envs/nwb/lib/python3.6/site-packages/hdmf/utils.py in call_docval_func(func, kwargs)
    403 def call_docval_func(func, kwargs):
    404     fargs, fkwargs = fmt_docval_args(func, kwargs)
--> 405     return func(*fargs, **fkwargs)
    406 
    407 

~/opt/anaconda3/envs/nwb/lib/python3.6/site-packages/hdmf/utils.py in func_call(*args, **kwargs)
    559             def func_call(*args, **kwargs):
    560                 pargs = _check_args(args, kwargs)
--> 561                 return func(args[0], **pargs)
    562         else:
    563             def func_call(*args, **kwargs):

~/opt/anaconda3/envs/nwb/lib/python3.6/site-packages/hdmf/backends/io.py in write(self, **kwargs)
     43         container = popargs('container', kwargs)
     44         f_builder = self.__manager.build(container, source=self.__source)
---> 45         self.write_builder(f_builder, **kwargs)
     46 
     47     @docval({'name': 'src_io', 'type': 'HDMFIO', 'doc': 'the HDMFIO object for reading the data to export'},

~/opt/anaconda3/envs/nwb/lib/python3.6/site-packages/hdmf/utils.py in func_call(*args, **kwargs)
    559             def func_call(*args, **kwargs):
    560                 pargs = _check_args(args, kwargs)
--> 561                 return func(args[0], **pargs)
    562         else:
    563             def func_call(*args, **kwargs):

~/opt/anaconda3/envs/nwb/lib/python3.6/site-packages/hdmf/backends/hdf5/h5tools.py in write_builder(self, **kwargs)
    714                           % (f_builder.name, self.source, kwargs))
    715         for name, gbldr in f_builder.groups.items():
--> 716             self.write_group(self.__file, gbldr, **kwargs)
    717         for name, dbldr in f_builder.datasets.items():
    718             self.write_dataset(self.__file, dbldr, **kwargs)

~/opt/anaconda3/envs/nwb/lib/python3.6/site-packages/hdmf/utils.py in func_call(*args, **kwargs)
    559             def func_call(*args, **kwargs):
    560                 pargs = _check_args(args, kwargs)
--> 561                 return func(args[0], **pargs)
    562         else:
    563             def func_call(*args, **kwargs):

~/opt/anaconda3/envs/nwb/lib/python3.6/site-packages/hdmf/backends/hdf5/h5tools.py in write_group(self, **kwargs)
    894             for subgroup_name, sub_builder in subgroups.items():
    895                 # do not create an empty group without attributes or links
--> 896                 self.write_group(group, sub_builder, **kwargs)
    897         # write all datasets
    898         datasets = builder.datasets

~/opt/anaconda3/envs/nwb/lib/python3.6/site-packages/hdmf/utils.py in func_call(*args, **kwargs)
    559             def func_call(*args, **kwargs):
    560                 pargs = _check_args(args, kwargs)
--> 561                 return func(args[0], **pargs)
    562         else:
    563             def func_call(*args, **kwargs):

~/opt/anaconda3/envs/nwb/lib/python3.6/site-packages/hdmf/backends/hdf5/h5tools.py in write_group(self, **kwargs)
    899         if datasets:
    900             for dset_name, sub_builder in datasets.items():
--> 901                 self.write_dataset(group, sub_builder, **kwargs)
    902         # write all links
    903         links = builder.links

~/opt/anaconda3/envs/nwb/lib/python3.6/site-packages/hdmf/utils.py in func_call(*args, **kwargs)
    559             def func_call(*args, **kwargs):
    560                 pargs = _check_args(args, kwargs)
--> 561                 return func(args[0], **pargs)
    562         else:
    563             def func_call(*args, **kwargs):

~/opt/anaconda3/envs/nwb/lib/python3.6/site-packages/hdmf/backends/hdf5/h5tools.py in write_dataset(self, **kwargs)
   1167         self.__set_written(builder)
   1168         if exhaust_dci:
-> 1169             self.__exhaust_dcis()
   1170 
   1171     @classmethod

~/opt/anaconda3/envs/nwb/lib/python3.6/site-packages/hdmf/backends/hdf5/h5tools.py in __exhaust_dcis(self)
    755             self.logger.debug("Exhausting DataChunkIterator from queue (length %d)" % len(self.__dci_queue))
    756             dset, data = self.__dci_queue.popleft()
--> 757             if self.__write_chunk__(dset, data):
    758                 self.__dci_queue.append((dset, data))
    759 

~/opt/anaconda3/envs/nwb/lib/python3.6/site-packages/hdmf/backends/hdf5/h5tools.py in __write_chunk__(cls, dset, data)
   1264         dset.id.extend(max_bounds)
   1265         # Write the data
-> 1266         dset[chunk_i.selection] = chunk_i.data
   1267 
   1268         return True

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

~/opt/anaconda3/envs/nwb/lib/python3.6/site-packages/h5py/_hl/dataset.py in __setitem__(self, args, val)
    705             mshape_pad = mshape
    706         mspace = h5s.create_simple(mshape_pad, (h5s.UNLIMITED,)*len(mshape_pad))
--> 707         for fspace in selection.broadcast(mshape):
    708             self.id.write(mspace, fspace, val, mtype, dxpl=self._dxpl)
    709 

~/opt/anaconda3/envs/nwb/lib/python3.6/site-packages/h5py/_hl/selections.py in broadcast(self, target_shape)
    297                     tshape.append(t)
    298                 else:
--> 299                     raise TypeError("Can't broadcast %s -> %s" % (target_shape, self.mshape))
    300 
    301         if any([n > 1 for n in target]):

TypeError: Can't broadcast (53057768, 96) -> (53057768, 24)

Loss of precision when converting ephys data from rec to nwb

Currently, rec_to_nwb does the following:

  1. calls rec_to_binaries, which converts the raw ephys voltage data from .rec to .mda format (dtype = int16; ADC units)
  2. parses the "raw_data_to_volts" key from the metadata YAML; according to a Jan 2021 Slack message from Loren, this value should always be set to 0.000000195 (i.e. 1.95e-7)
  3. multiplies that value by 1e6 to get the conversion factor from raw to uV (0.195), which matches the value stored in the .rec XML file headers (rawScalingToUv="0.19500000000000001")
  4. multiplies the raw int16 data (in ADC units) from the .mda file by 0.195 and then casts back to int16, which truncates any values after the decimal point (e.g. 0.99 -> 0)
  5. writes this transformed raw data (now in uV) to an NWB ElectricalSeries object named "e-series" with a 1e-6 conversion factor, used to convert the data to volts

Because of the data transformation in Step 4 above, there is a loss of precision. Let's say the original .rec file data has values:

>>> np.arange(10, dtype="int16")
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int16)

then after multiplying by 0.195 to get the data in uV, the values are:

>>> np.arange(10, dtype="int16") * 0.195
array([0.   , 0.195, 0.39 , 0.585, 0.78 , 0.975, 1.17 , 1.365, 1.56 ,
       1.755])

then after setting the dtype to int16, the values are:

>>> (np.arange(10, dtype="int16") * 0.195).astype("int16")
array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1], dtype=int16)

Note the loss of precision in the resulting output. If the original data has unique values -100 to 100 (201 possible values), then the converted NWB file will have unique values -19 to 19 (39 possible values). This could affect spike sorting and LFP filtering; the impact is probably small, but it is not zero.

For the above reason, it is more common to store the raw, untransformed int16 ephys data (ADC units) from an acquisition system as the ElectricalSeries data, and to store the conversion factor (here: 0.000000195). However, NWB users (such as Spyglass) have to remember to multiply the data by the conversion factor to get the data in volts. (The NWB team is working on improving this messaging...) Note that this makes using the data just a little slower and converting the data just a little faster.

I suggest that the ephys data be stored in the original ADC units, because currently some precision is lost, and the cost of multiplying during use is small.
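
As a hedged sketch of that suggestion (this is not what rec_to_nwb currently writes), the untransformed int16 values would go straight into the ElectricalSeries, with the volts-per-bit factor stored in conversion; raw_data, electrode_table_region, and timestamps are assumed to come from elsewhere in the pipeline:

from pynwb.ecephys import ElectricalSeries

e_series = ElectricalSeries(
    name='e-series',
    data=raw_data,                      # int16 ADC units, no scaling
    electrodes=electrode_table_region,
    timestamps=timestamps,
    conversion=0.000000195,             # 1.95e-7 volts per ADC unit
)
# Consumers then recover volts as data * conversion, with no
# truncation loss on the stored integers.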

Error during copying video file

I am getting an error (see below) during nwb generation that seems to be permissions-related, at the stage of copying over video files. Investigating with the debugger shows that the src and dst files and their containing directories all have rwx permissions for everyone. Any thoughts greatly appreciated! Thanks in advance.

In case useful, here are a few of the variables from the end of the traceback -
st: os.stat_result(st_mode=33279, st_ino=25182299, st_dev=59, st_nlink=1, st_uid=39495, st_gid=39495, st_size=107605743, st_atime=1679965970, st_mtime=1645842377, st_ctime=1666115234)

dst: '/cumulus/amankili/Banner/nwbout/video/20220225_Banner_01_sleepBan77mWnostim.1.h264'

src: '/cumulus/amankili/Banner/raw/20220225/20220225_Banner_01_sleepBan77mWnostim.1.h264'

self.video_files_to_copy:
['20220225_Banner_01_sleepBan77mWnostim.1.h264', '20220225_Banner_03_sleepBan77mWnostim.1.h264', '20220225_Banner_05_sleepBan77mWnostim.1.h264', '20220225_Banner_07_sleepBan77mWnostim.1.h264', '20220225_Banner_09_sleepBan77mWnostim.1.h264', '20220225_Banner_11_sleepBan77mWnostim.1.h264', '20220225_Banner_13_sleepBan77mWnostim.1.h264', '20220225_Banner_15_sleepBan77mWnostim.1.h264', '20220225_Banner_17_sleepBan77mWnostim.1.h264', '20220225_Banner_02_wtrackBan77mWlockout80mstheta90.1.h264', '20220225_Banner_04_wtrackBan77mWlockout80mstheta90.1.h264', '20220225_Banner_06_wtrackBan77mWlockout80mstheta90.1.h264', '20220225_Banner_08_wtrackBan77mWlockout80mstheta90.1.h264', '20220225_Banner_10_wtrackBan77mWlockout80mstheta90.1.h264', '20220225_Banner_12_wtrackBan77mWlockout80mstheta90.1.h264', '20220225_Banner_14_wtrackBan77mWlockout80mstheta90.1.h264', '20220225_Banner_16_wtrackBan77mWlockout80mstheta90.1.h264']

---------------------------------------------------------------------------
PermissionError                           Traceback (most recent call last)
/tmp/ipykernel_3511942/1858210567.py in <module>
     37                               video_path=video_path,
     38                               trodes_rec_export_args = trodes_rec_export_args)
---> 39         content = builder.build_nwb()
     40         print(content) #just to look
     41         # Automatically delete preprocessing files? Not for this guy given old pipeline use still

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in build_nwb(self, run_preprocessing, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
    230             process_mda_invalid_time=process_mda_invalid_time,
    231             process_pos_valid_time=process_pos_valid_time,
--> 232             process_pos_invalid_time=process_pos_invalid_time)
    233 
    234         logger.info('Done...\n')

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in __build_nwb_file(self, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
    244         for date in self.dates:
    245             logger.info('Date: {}'.format(date))
--> 246             nwb_builder = self.get_nwb_builder(date)
    247             content = nwb_builder.build()
    248             nwb_builder.write(content)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in get_nwb_builder(self, date)
    303             reconfig_header=self.__get_header_path(),
    304             # reconfig_header=self.__is_rec_config_valid()
--> 305             **old_dataset_kwargs
    306         )
    307 

~/Src/rec_to_nwb/rec_to_nwb/processing/tools/beartype/beartype.py in func_beartyped(__beartype_func, *args, **kwargs)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/nwb_file_builder.py in __init__(self, data_path, animal_name, date, nwb_metadata, process_dio, process_mda, process_analog, process_pos_timestamps, preprocessing_path, video_path, output_file, reconfig_header, is_old_dataset, session_start_time)
    294                              'raw', self.date),
    295                 self.video_path,
--> 296                 self.metadata["associated_video_files"],
    297             )
    298 

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/originators/video_files_originator.py in __init__(self, raw_data_path, video_path, video_files_metadata, convert_timestamps, return_timestamps)
     16             raw_data_path, video_path, video_files_metadata,
     17             convert_timestamps=convert_timestamps,
---> 18             return_timestamps=return_timestamps)
     19 
     20     def make(self, nwb_content):

~/Src/rec_to_nwb/rec_to_nwb/processing/tools/beartype/beartype.py in func_beartyped(__beartype_func, *args, **kwargs)

~/Src/rec_to_nwb/rec_to_nwb/processing/nwb/components/video_files/fl_video_files_manager.py in __init__(self, raw_data_path, video_path, video_files_metadata, convert_timestamps, return_timestamps)
     21         self.video_files_copy_maker = VideoFilesCopyMaker(
     22             [video_files['name'] for video_files in video_files_metadata])
---> 23         self.video_files_copy_maker.copy(raw_data_path, video_path)
     24         self.fl_video_files_extractor = FlVideoFilesExtractor(
     25             raw_data_path, video_files_metadata,

~/Src/rec_to_nwb/rec_to_nwb/processing/nwb/components/video_files/video_files_copy_maker.py in copy(self, src, dst)
     15             raise InvalidPathException(dst + ' is not valid path')
     16         for video_file in self.video_files_to_copy:
---> 17             copy_file(os.path.join(src,  video_file), dst)

~/anaconda3/envs/rec_to_nwb/lib/python3.7/shutil.py in copy(src, dst, follow_symlinks)
    247         dst = os.path.join(dst, os.path.basename(src))
    248     copyfile(src, dst, follow_symlinks=follow_symlinks)
--> 249     copymode(src, dst, follow_symlinks=follow_symlinks)
    250     return dst
    251 

~/anaconda3/envs/rec_to_nwb/lib/python3.7/shutil.py in copymode(src, dst, follow_symlinks)
    142 
    143     st = stat_func(src)
--> 144     chmod_func(dst, stat.S_IMODE(st.st_mode))
    145 
    146 if hasattr(os, 'listxattr'):

PermissionError: [Errno 1] Operation not permitted: '/cumulus/amankili/Banner/nwbout/video/20220225_Banner_01_sleepBan77mWnostim.1.h264'
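
The traceback shows the data copy itself succeeds; the failure is in shutil.copymode, i.e. the chmod on the destination, which some network filesystems refuse even when the mode bits look permissive. A hedged workaround sketch (not an official fix) is to copy only the file contents, using the src and dst values above:

import shutil

# shutil.copyfile copies data only; unlike shutil.copy it never
# calls os.chmod on the destination, so it avoids the copymode step
# that raises PermissionError here.
shutil.copyfile(src, dst)  # dst is the full destination file path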

Duplicate timestamps in pos_online file seems to be a problem for generating nwb file

I get a dimension mismatch error when trying to generate an nwb file with a recording that had a disconnect (J1620210609_.nwb; recording 19_h2 has the disconnect).

I believe this error occurs when trying to index position_tracking (length 202873) with a boolean array the same length as ptp_systime (length 76007) in position_originator.py:

            position_tracking = (
                position_tracking
                .iloc[ptp_systime > pause_mid_time]
                .set_index(ptp_timestamps)) 

It looks like position_tracking corresponds to 20210609_J16_19_h2.1.pos_online.dat, and ptp_systime has data from .pos_cameraHWFrameCount.dat (via video_info). On line 107 of position_originator, the unique indices of position_tracking are used to define ptp_systime (via video_info):

video_info = video_info.loc[position_tracking.index.unique()]

It looks like there are duplicate indices in position_tracking in the latter half of the recording (perhaps related to the disconnect?), so ptp_systime ends up shorter than position_tracking.

@lfrank or @edeno, do you have thoughts about how to address this? Thanks in advance.
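
For illustration, one hedged possibility (not a confirmed fix) is to drop the duplicated timestamps before the boolean indexing, so position_tracking and ptp_systime end up the same length; this assumes the pandas objects named in position_originator.py:

# Keep only the first row for each repeated timestamp so the index
# of position_tracking matches the per-frame entries in video_info.
position_tracking = position_tracking[
    ~position_tracking.index.duplicated(keep='first')]
video_info = video_info.loc[position_tracking.index.unique()]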

Full traceback:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/tmp/ipykernel_378752/3133151736.py in <module>
     46                               trodes_rec_export_args=trodes_rec_export_args)
     47 
---> 48     content = builder.build_nwb()
     49     print(content)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in build_nwb(self, run_preprocessing, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
    230             process_mda_invalid_time=process_mda_invalid_time,
    231             process_pos_valid_time=process_pos_valid_time,
--> 232             process_pos_invalid_time=process_pos_invalid_time)
    233 
    234         logger.info('Done...\n')

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in __build_nwb_file(self, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
    245             logger.info('Date: {}'.format(date))
    246             nwb_builder = self.get_nwb_builder(date)
--> 247             content = nwb_builder.build()
    248             nwb_builder.write(content)
    249             if self.is_old_dataset:

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/nwb_file_builder.py in build(self)
    369             self.associated_files_originator.make(nwb_content)
    370 
--> 371         self.position_originator.make(nwb_content)
    372 
    373         valid_map_dict = self.__build_corrupted_data_manager()

~/Src/rec_to_nwb/rec_to_nwb/processing/tools/beartype/beartype.py in func_beartyped(__beartype_func, *args, **kwargs)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in make(self, nwb_content)
     50             first_timestamps = []
     51             for series_id, (conversion, position_tracking_path) in enumerate(
---> 52                     zip(meters_per_pixels, position_tracking_paths)):
     53                 position_df = self.get_position_with_corrected_timestamps(
     54                     position_tracking_path)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in get_position_with_corrected_timestamps(position_tracking_path)
    124             ptp_timestamps = pd.Index(
    125                 ptp_systime[ptp_systime > pause_mid_time] /
--> 126                 NANOSECONDS_PER_SECOND,
    127                 name='time')
    128             position_tracking = (

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexing.py in __getitem__(self, key)
    929 
    930             maybe_callable = com.apply_if_callable(key, self.obj)
--> 931             return self._getitem_axis(maybe_callable, axis=axis)
    932 
    933     def _is_scalar_access(self, key: tuple):

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1551         if com.is_bool_indexer(key):
   1552             self._validate_key(key, axis)
-> 1553             return self._getbool_axis(key, axis=axis)
   1554 
   1555         # a list of integers

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexing.py in _getbool_axis(self, key, axis)
    946         # caller is responsible for ensuring non-None axis
    947         labels = self.obj._get_axis(axis)
--> 948         key = check_bool_indexer(labels, key)
    949         inds = key.nonzero()[0]
    950         return self.obj._take_with_is_copy(inds, axis=axis)

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexing.py in check_bool_indexer(index, key)
   2399         # key may contain nan elements, check_array_indexer needs bool array
   2400         result = pd_array(result, dtype=bool)
-> 2401     return check_array_indexer(index, result)
   2402 
   2403 

~/anaconda3/envs/rec_to_nwb/lib/python3.7/site-packages/pandas/core/indexers.py in check_array_indexer(array, indexer)
    560         if len(indexer) != len(array):
    561             raise IndexError(
--> 562                 f"Boolean index has wrong length: "
    563                 f"{len(indexer)} instead of {len(array)}"
    564             )

IndexError: Boolean index has wrong length: 76007 instead of 202873

epoch times not aligned in pos and epoch interval lists for recording with disconnect

The IntervalList table is automatically populated with interval lists for epochs (e.g. "02_r1") and position valid times (e.g. "pos 1 valid times") when insert_sessions is run on an nwb file. For peanut20201114_.nwb, "19_h2" (the epoch interval list) starts about 18 minutes earlier than "pos 18 valid times" (the position valid times interval list), even though these correspond to the same recording. In case relevant, these intervals are roughly the same length. Also, this recording had a disconnect. @edeno or @lfrank, I was wondering if either of you might have an idea of what could cause this offset. I can look into the rec_to_nwb code if not. Thanks in advance for any thoughts.

position_originator thinks epoch timestamps out of order

I got the following error when generating an nwb file. It looks like the error arises because the position_originator thinks the epoch timestamps are out of order. I don't see anything obviously amiss in the yaml file used to generate this nwb file. The distances between epoch starts below seem reasonable up to the last one. I am aware @shijiegu also posted an error arising in the position_originator, although that error looked slightly different, so I'm posting mine here. Thanks in advance for any help with this.

ipdb> np.diff(first_timestamps) / 60

array([  54.81162518,   21.76272645,   50.42214757,   21.59982565,
         53.52381701,   28.49434762,   36.55967264,   10.7667066 ,
         22.31105215,   57.62421057,   28.51698989,   49.44335831,
         22.06270885,   54.13967245,   22.25902193,   53.1217701 ,
        106.73524576,  145.68806351,  151.27331273,  395.00253439,
       -772.70757801])
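
For anyone debugging the same assertion, a small hedged snippet to find which epoch boundary goes backwards (first_timestamps as collected in position_originator.make()):

import numpy as np

# Index i in bad_epochs means epoch i starts later than epoch i+1;
# here it flags the last boundary, matching the -772.7 diff above.
bad_epochs = np.where(np.diff(first_timestamps) < 0)[0]
print(bad_epochs)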

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
/tmp/ipykernel_3174520/3482281323.py in <module>
     46                               trodes_rec_export_args=trodes_rec_export_args)
     47 
---> 48     content = builder.build_nwb()
     49     print(content)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in build_nwb(self, run_preprocessing, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
    224             process_mda_invalid_time=process_mda_invalid_time,
    225             process_pos_valid_time=process_pos_valid_time,
--> 226             process_pos_invalid_time=process_pos_invalid_time)
    227         logger.info('Done...\n')
    228 

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/raw_to_nwb_builder.py in __build_nwb_file(self, process_mda_valid_time, process_mda_invalid_time, process_pos_valid_time, process_pos_invalid_time)
    238             logger.info('Date: {}'.format(date))
    239             nwb_builder = self.get_nwb_builder(date)
--> 240             content = nwb_builder.build()
    241             nwb_builder.write(content)
    242             if self.is_old_dataset:

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/nwb_file_builder.py in build(self)
    369             self.associated_files_originator.make(nwb_content)
    370 
--> 371         self.position_originator.make(nwb_content)
    372 
    373         valid_map_dict = self.__build_corrupted_data_manager()

~/Src/rec_to_nwb/rec_to_nwb/processing/tools/beartype/beartype.py in func_beartyped(__beartype_func, *args, **kwargs)

~/Src/rec_to_nwb/rec_to_nwb/processing/builder/originators/position_originator.py in make(self, nwb_content)
     63         # check if timestamps are in order
     64         first_timestamps = np.asarray(first_timestamps)
---> 65         assert np.all(first_timestamps[:-1] < first_timestamps[1:])
     66 
     67         logger.info('Position: Injecting into Processing Module')

AssertionError: 

Address best practices violations

These best practices violations came up when Abhilasha tried to upload her data to DANDI. These should be addressed in the generation of the NWB file. I'm not sure if this is the right repo to report this or if these have been addressed already.

See here for a full list of best practices https://nwbinspector.readthedocs.io/en/dev/best_practices/best_practices_index.html (most of these were written after rec_to_nwb was written).

Required for DANDI upload:

  1. The subject sex should be "M", "F", "U" (unknown), or "O" (other)
  2. The subject age or date of birth is required. For formatting, see the above documentation
  3. The file path to the external video file should be a relative path not an absolute path

Highly recommended:

  1. The ProcessingModule "camera_sample_frame_counts" should not be added. Its timestamps are non-ascending, and its values are difficult to interpret and apparently superfluous for time alignment of cameras with other data (please confirm @khl02007)
  2. The subject species name should be the latin binomial name, e.g., "Rattus norvegicus" not "rat"
  3. The experimenter name should have one of these forms: 'LastName, FirstName', 'LastName, FirstName MiddleInitial.', or 'LastName, FirstName, MiddleName' (see the sketch after this list)
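
A hedged pynwb sketch covering the required items (the field names are real pynwb arguments; the values are placeholders, not taken from any actual dataset):

from datetime import datetime
from dateutil.tz import tzlocal
from pynwb import NWBFile
from pynwb.file import Subject

subject = Subject(
    subject_id='example_subject',
    sex='M',                           # one of 'M', 'F', 'U', 'O'
    species='Rattus norvegicus',       # Latin binomial, not 'rat'
    date_of_birth=datetime(2020, 1, 1, tzinfo=tzlocal()),
)

nwbfile = NWBFile(
    session_description='example session',
    identifier='example-id',
    session_start_time=datetime.now(tzlocal()),
    experimenter='Lastname, Firstname',  # 'LastName, FirstName' form
    subject=subject,
)
# External video files should also be referenced by a relative path
# (e.g. 'video/<file>.h264'), not an absolute path.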
