phylib's Introduction

phylib

Electrophysiological data analysis library used by phy, a spike sorting visualization tool, and by ibllib.

phylib's People

Contributors

alejoe91, balefebvre, chris-langfield, kushbanga, mayofaulkner, nbonacchi, oliche, ponytojas, rossant, yger, zm711

phylib's Issues

Minor shift in y axis between raw/mean/template

This is a cosmetic detail, but in the WaveformView, when switching from snippets to the mean or to the template, the baseline of all the curves shifts along the y axis. This is not a major issue (it may be caused by automatic rescaling), but it makes it harder to appreciate differences between the curves.

_load_ function problems

  1. Need to add from glob import glob for _load_metadata

    phylib/phylib/io/model.py

    Lines 321 to 324 in f72d0a5

    def _load_metadata(self):
        """Load cluster metadata from all CSV/TSV files in the data directory."""
        files = list(self.dir_path.glob('*.csv'))
        files.extend(self.dir_path.glob('*.tsv'))
  2. self.dir_path does not seem to be recognised as a Path object in _load_spike_samples

    phylib/phylib/io/model.py

    Lines 426 to 429 in f72d0a5

    def _load_spike_samples(self):
        # WARNING: "spike_times.npy" is in units of samples. Need to
        # divide by the sampling rate to get spike times in seconds.
        path = self.dir_path / 'spike_times.npy'

    self.dir_path = Path(dir_path) didn't work in the constructor for me, so I used this non-ideal workaround:

path = self.dir_path + '/' + 'spike_times.npy'
path = Path(path)

phy GUI opens after this, but then produces an assertion error:

c:\conda\miniconda3\envs\phydev\lib\site-packages\numpy\core\_methods.py:75: RuntimeWarning: invalid value encountered in reduce
  ret = umr_sum(arr, axis, dtype, out, keepdims)
c:\conda\miniconda3\envs\phydev\lib\site-packages\numpy\lib\function_base.py:3405: RuntimeWarning: invalid value encountered in median
  r = func(a, **kwargs)
c:\conda\miniconda3\envs\phydev\lib\site-packages\phylib\utils\geometry.py:161: RuntimeWarning: invalid value encountered in less
  assert np.all(data_bounds[:, 1] < data_bounds[:, 3])

...
Traceback (most recent call last):
  File "c:\conda\miniconda3\envs\phydev\lib\site-packages\phy\apps\template\gui.py", line 723, in init_cluster_ids
    view.plot()
  File "c:\conda\miniconda3\envs\phydev\lib\site-packages\phy\cluster\views\template.py", line 158, in plot
    self._plot_templates(bunchs, data_bounds=data_bounds)
  File "c:\conda\miniconda3\envs\phydev\lib\site-packages\phy\cluster\views\template.py", line 103, in _plot_templates
    self.visual.add_batch_data(**data, data_bounds=data_bounds)
  File "c:\conda\miniconda3\envs\phydev\lib\site-packages\phy\plot\base.py", line 142, in add_batch_data
    data = self.validate(**kwargs)
  File "c:\conda\miniconda3\envs\phydev\lib\site-packages\phy\plot\visuals.py", line 296, in validate
    data_bounds = _get_data_bounds(data_bounds, length=n_signals)
  File "c:\conda\miniconda3\envs\phydev\lib\site-packages\phylib\utils\geometry.py", line 193, in _get_data_bounds
    _check_data_bounds(data_bounds)
 File "c:\conda\miniconda3\envs\phydev\lib\site-packages\phylib\utils\geometry.py", line 161, in _check_data_bounds
    assert np.all(data_bounds[:, 1] < data_bounds[:, 3])
AssertionError
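
For reference, here is a minimal sketch of the Path-based fix the constructor presumably intends (assuming dir_path arrives as a plain string and only needs normalizing once; the class below is illustrative, the real one is phylib.io.model.TemplateModel):

from pathlib import Path

class TemplateModelSketch:
    def __init__(self, dir_path):
        # normalize once so every loader can use the / operator on a real Path
        self.dir_path = Path(dir_path)

    def _spike_samples_path(self):
        # the original line from _load_spike_samples then works unchanged
        return self.dir_path / 'spike_times.npy'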

Get spike depths without pc_features.npy or pc_features_ind.npy

Kilosort v3.0 does not generate pc_features.npy or pc_features_ind.npy; however, these files are required to get an estimate of the spike depths in phylib.io.model.get_depths(). These depths are required in order to run some iblapps (e.g. https://github.com/int-brain-lab/iblapps/tree/master/atlaselectrophysiology).

Do you know if there are any alternative ways to extract spike depths via phylib, so that kilosort v3.0 output can be used in these iblapps?
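
One possible alternative (not a confirmed phylib code path; the file names are standard Kilosort/phy outputs, and the amplitude-weighted centre of mass is an assumption on my part) is to estimate a depth per template from the probe geometry and broadcast it to spikes:

import numpy as np

# sketch: depth of each template as the amplitude-weighted centre of mass
# of the channel y-positions, then one depth per spike via its template id
templates = np.load('templates.npy')                        # (n_templates, n_samples, n_channels)
channel_positions = np.load('channel_positions.npy')        # (n_channels, 2): x, y per channel
spike_templates = np.load('spike_templates.npy').squeeze()  # template id per spike

ptp = templates.max(axis=1) - templates.min(axis=1)         # peak-to-peak amplitude per channel
weights = ptp ** 2
template_depths = (weights * channel_positions[:, 1]).sum(axis=1) / weights.sum(axis=1)
spike_depths = template_depths[spike_templates]             # (n_spikes,) in probe units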

phy extract-waveforms saves waveforms with wrong dtype if raw data file is encoded as float32

Steps to reproduce:

  1. Download and unzip the example dataset: https://drive.google.com/file/d/1mshkvPaxKpHjWK4z67HtfXlUvyuprX9B/view?usp=sharing
  2. Navigate to the folder you saved it to in your command line
  3. Run phy extract-waveforms params.py
  4. Start Python and run the following commands:
import numpy as np
np.load('_phy_spikes_subset.waveforms.npy')

Expected behaviour:

The waveforms are loaded

Actual behaviour:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\jmb9770\Anaconda3\envs\phy\Lib\site-packages\numpy\lib\npyio.py", line 456, in load
    return format.read_array(fid, allow_pickle=allow_pickle,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\jmb9770\Anaconda3\envs\phy\Lib\site-packages\numpy\lib\format.py", line 839, in read_array
    array.shape = shape
    ^^^^^^^^^^^
ValueError: cannot reshape array of size 31004592 into shape (63534,61,16)

Environment info:

OS: Windows 10 x64
Python version: 3.11.9
Conda version: 23.3.1
phy version: 2.0b5
phylib version: 2.4.3

Additional info:

The culprit appears to be on line 657 of phylib/io/traces.py, where the dtype of the waveforms is inferred if sample2unit is None, and set to float otherwise. The phy command extract-waveforms never sets sample2unit, so it always defaults to 1.0, and the written waveforms therefore have dtype float, which on most modern Python installations means float64. If the raw data file from which the waveforms are loaded is of integer type, the multiplication by 1.0 coerces them to float64, so they are written correctly. If the raw data file is of type float32, however, no such coercion takes place, and the NpyWriter byte-copies the float32-encoded waveforms into a waveforms .npy file whose header claims dtype float64.
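
Until this is fixed upstream, a possible recovery sketch (assuming, as the size mismatch above suggests, that the header claims float64 while the payload is raw float32 written in C order):

import numpy as np
from numpy.lib import format as npy_format

# read the header ourselves, then reinterpret the payload bytes as float32
with open('_phy_spikes_subset.waveforms.npy', 'rb') as f:
    version = npy_format.read_magic(f)                     # (major, minor) of the .npy format
    shape, fortran_order, dtype = npy_format.read_array_header_1_0(f)  # header dtype is wrong
    data = np.fromfile(f, dtype=np.float32)                # the true on-disk dtype
waveforms = data.reshape(shape)                            # (n_spikes, n_samples, n_channels)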

Cannot open binary file with esoteric extension?

When opening phy with a raw binary file that has the .raw extension, I get the following error:

klass, arg, kwargs = _get_ephys_constructor(obj[0], **kwargs)
OSError: [Errno Unknown file extension: %s.] .raw

Can it be bypassed?
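
A possible workaround sketch (assuming .bin is among the flat-binary extensions phylib recognizes; the file name is hypothetical): expose the file under a recognized extension with a symlink rather than a copy.

from pathlib import Path

src = Path('recording.raw')   # hypothetical file name
dst = src.with_suffix('.bin')
if not dst.exists():
    dst.symlink_to(src)       # no data copied; point params.py / dat_path at dst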

error getting n_spikes from get_cluster_spike_waveforms

Hi! I am trying to write some code to filter out clusters with low n_spikes in my phy input data. I have tried the following code:
waveforms = model.get_cluster_spike_waveforms(cluster_id)
n_spikes, n_samples, n_channels_loc = waveforms.shape

This works for most cluster_ids, but for a few of them I keep getting this assertion error:

(screenshot of the assertion error omitted)

Any advice on how to fix this?
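
One way to keep the filter moving until the underlying bug is found (a sketch continuing the snippet above; the 50-spike threshold is illustrative, and the fallback counts spikes from the standard spike_clusters attribute):

import numpy as np

# skip clusters whose waveform extraction raises, and fall back to
# counting spikes directly from the spike-cluster assignment
good_clusters = []
for cluster_id in np.unique(model.spike_clusters):
    try:
        n_spikes = model.get_cluster_spike_waveforms(cluster_id).shape[0]
    except AssertionError:
        n_spikes = int((model.spike_clusters == cluster_id).sum())
    if n_spikes >= 50:               # hypothetical minimum spike count
        good_clusters.append(cluster_id)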

if spike_clusters.npy doesn't exist, fails to load template-gui

in phylib/io/model.py, line 414, when spike_clusters.npy doesn't exist it is not created properly because 'path' is not defined.
I'm not sure if this is the intended behaviour, but adding:
path = self.dir_path / 'spike_clusters.npy'
at line 414 allows the template GUI to load correctly.
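
A slightly fuller sketch of the fallback the loader presumably intends (an assumption on my part: a missing spike_clusters.npy is seeded from spike_templates.npy, which is how phy initializes a fresh sort):

import numpy as np
from pathlib import Path

def ensure_spike_clusters(dir_path):
    # if spike_clusters.npy is missing, seed it from the template
    # assignments so the GUI has an initial clustering to display
    dir_path = Path(dir_path)
    path = dir_path / 'spike_clusters.npy'
    if not path.exists():
        np.save(path, np.load(dir_path / 'spike_templates.npy'))
    return path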

Single template

Hi,

I used spyking-circus to "sort" a single unit, which gave me a single template.
I was having trouble loading this dataset into phy, and I tracked the problem down to this line:

return read_array(path, mmap_mode=mmap_mode).squeeze()

The problem arises because read_array returns an array of shape (1, 91, 4), and the squeeze method then changes it to just (91, 4).
This line then gives the error:

assert cols.shape == (n_templates, n_channels_loc)

just after 'Loading templates' and 'Templates are sparse'.

I tried exporting from spyking-circus to phy without sparse matrices, and now it fails at this line:

assert self.similar_templates.shape == (self.n_templates, self.n_templates)

after 'Loading the inverse of the whitening matrix.'

If I remove the squeeze I can load this dataset with only one template, and I didn't have problems with any other dataset sorted with spyking-circus, but I can no longer load datasets sorted with Kilosort 2.5.
With Kilosort output it fails at this line, just after 'Start capturing exceptions':

assert samples.ndim == times.ndim == 1
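
A shape-preserving alternative to the bare .squeeze() might look like this (a sketch; phylib's actual loader wraps read_array, and the dimension check is an assumption on my part):

import numpy as np

def load_templates(path, mmap_mode=None):
    # restore the template axis when a single template collapsed it,
    # instead of squeezing away every dimension of size 1
    data = np.load(path, mmap_mode=mmap_mode)
    if data.ndim == 2:                # single template: (n_samples, n_channels_loc)
        data = data[np.newaxis, ...]  # -> (1, n_samples, n_channels_loc)
    return data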

Bug for sparse export

In io/model.py, _get_template_sparse(), L729

channel_ids = channel_ids.astype(np.uint32)

should be put after L733

channel_ids = channel_ids[used]

in order to work. Otherwise, the conversion to uint32 mangles the -1 padding entries, leading to errors. I can make a PR, but it is just a matter of moving one line of code.
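
A sketch of the proposed ordering (self-contained for illustration; 'used' is assumed to be the mask that drops the -1 padding entries):

import numpy as np

# toy channel_ids with the -1 sparse padding described above
channel_ids = np.array([3, 7, 12, -1, -1])
used = channel_ids >= 0
channel_ids = channel_ids[used]               # filter first...
channel_ids = channel_ids.astype(np.uint32)   # ...then cast: no -1 left to wrap around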

bug when run from pykilosort

in pykilosort/main.py, l.38, get_ephys_reader is called:

raw_data = get_ephys_reader(dat_path)

The default value of the n_channels parameter in this class's __init__() is None.

This gets fed to _memmap_flat() in phylib/io/traces.py, l.310, where it is compared with 0: assert n_channels > 0 (phylib/io/traces.py, l.154).

Full log:

Traceback (most recent call last):
  File "<ipython-input-10-e3f4401b61dd>", line 4, in <module>
    run(dat_path=dat_path, probe=probe, params=params, dir_path=dir_path)
  File "C:\Users\Maxime\Desktop\pyKilosort\pykilosort\pykilosort\main.py", line 38, in run
    raw_data = get_ephys_reader(dat_path)
  File "c:\users\maxime\anaconda3\envs\pykilosort\lib\site-packages\phylib\io\traces.py", line 490, in get_ephys_reader
    return klass(arg, **kwargs)
  File "c:\users\maxime\anaconda3\envs\pykilosort\lib\site-packages\phylib\io\traces.py", line 310, in __init__
    self._mmaps = [
  File "c:\users\maxime\anaconda3\envs\pykilosort\lib\site-packages\phylib\io\traces.py", line 311, in <listcomp>
    _memmap_flat(path, dtype=dtype, n_channels=n_channels, offset=offset)
  File "c:\users\maxime\anaconda3\envs\pykilosort\lib\site-packages\phylib\io\traces.py", line 154, in _memmap_flat
    assert n_channels > 0
TypeError: '>' not supported between instances of 'NoneType' and 'int'
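
A sketch of the call-site fix, continuing the snippet above (the keyword names mirror the parameters visible in the traceback; probe.NchanTOT and the literal values are assumptions on my part):

# pass the flat-binary layout explicitly so _memmap_flat receives a
# real channel count instead of None
raw_data = get_ephys_reader(
    dat_path,
    n_channels=probe.NchanTOT,   # hypothetical attribute on the probe object
    dtype='int16',               # recording-specific assumption
    sample_rate=30000,           # recording-specific assumption
)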

TemplateFeature sparse->full transform messed up if there are < 32 clusters

If there are fewer than 32 clusters, the template_feature_ind.npy matrix that is used to translate template_feature.npy data from sparse into full has repeated rows, which messes up the assignment here. It seems to happen only if splitAllClusters.m in Kilosort 2 or 2.5 is run. I didn't get to the bottom of why that is; presumably something to do with the template projections not being recomputed.

But in any case I think it's easily fixed by adding some logic on line 119 of model.py:

    if n_channels_loc > n_channels:
        # If there are fewer templates (n_channels here) than the number of
        # mapping indices (32 for template projections by KiloSort),
        # limit the indexing to avoid an incorrect mapping
        out[x[:, :n_channels, ...], cols_loc[:, :n_channels, ...], ...] = data[:, :n_channels, ...]
    else:
        out[x, cols_loc, ...] = data

Not really sure what else this thing does so I didn't want to put in a pull request.

Here's what it looks like before the fix; you can see cluster 9 overlapping the noise cluster (screenshot omitted).

Here's after the fix (screenshot omitted).

Adding a Refractory Period Violation number somewhere

How easy would it be to add an extra column to the ClusterView? I saw you added the firing rates, which is great, but what about Refractory Period Violations (RPV)? That is: what percentage of spikes (among all the spikes for a given unit) have an interspike interval below a certain threshold (2 ms, for example)? This is quite useful for quickly identifying MUA, since MUA show a high level of RPV.
And more generally, are the columns hard-coded somewhere, or is there a way to provide custom functions for sorting the cells?
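
For reference, the metric itself is cheap to compute from a cluster's spike times; a minimal sketch (the function name and the 2 ms default are illustrative):

import numpy as np

def rpv_fraction(spike_times, refractory_ms=2.0):
    """Fraction of interspike intervals shorter than the refractory period.

    spike_times: 1D array of spike times in seconds.
    """
    isi = np.diff(np.sort(spike_times))
    if isi.size == 0:
        return 0.0
    return float((isi < refractory_ms / 1000.0).mean())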
