
workflow-array-ephys

Pipeline for extracellular electrophysiology using Neuropixels probes and the Kilosort clustering method

Build a full ephys pipeline using the canonical pipeline elements

This repository provides demonstrations for:

  1. Setting up a workflow using different elements (see workflow_array_ephys/pipeline.py)
  2. Ingesting data and metadata
  3. Ingesting clustering results (a built-in routine of the ephys element)

Pipeline Architecture

The electrophysiology pipeline presented here uses components from four DataJoint Elements (element-lab, element-animal, element-session, and element-array-ephys), assembled to form a fully functional workflow.

(Diagrams: the element-lab and element-animal schemas, assembled with element-array-ephys to form the full pipeline.)
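
For a sense of how this assembly works, below is a simplified sketch of how a pipeline.py module might activate and link the elements. The schema names and the linking_module argument are illustrative; activation signatures can differ between element versions, so refer to workflow_array_ephys/pipeline.py for the actual code.

import datajoint as dj
from element_lab import lab
from element_animal import subject
from element_session import session
from element_array_ephys import probe, ephys

db_prefix = dj.config["custom"].get("database.prefix", "")

# Each element is activated onto its own database schema; downstream
# elements find their upstream dependencies through the linking module.
lab.activate(db_prefix + "lab")
subject.activate(db_prefix + "subject", linking_module=__name__)
session.activate(db_prefix + "session", linking_module=__name__)
ephys.activate(db_prefix + "ephys", db_prefix + "probe", linking_module=__name__)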

Installation instructions

Step 1 - Clone this project

Clone this repository from GitHub:

  • Launch a new terminal and change directory to where you want to clone the repository
    cd C:/Projects
    
  • Clone the repository:
    git clone https://github.com/datajoint/workflow-array-ephys
    
  • Change directory to workflow-array-ephys
    cd workflow-array-ephys
    

Step 2 - Set up a virtual environment

It is highly recommended (though not strictly required) to create a virtual environment to run the pipeline.

  • You can create one with virtualenv or conda. The commands below use virtualenv.

  • If virtualenv is not yet installed, run pip install --user virtualenv

  • To create a new virtual environment named venv:

    virtualenv venv
    
  • To activate the virtual environment:

    • On Windows:

      .\venv\Scripts\activate
      
    • On Linux/macOS:

      source venv/bin/activate
      

Step 3 - Install this repository

From the root of the cloned repository directory: pip install -e .

Note: the -e flag installs this repository in editable mode, in case you need to modify the code (e.g. the pipeline.py or paths.py scripts). If no such modification is required, pip install . is sufficient.
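
To sanity-check the install, you can import the package from Python; with an editable install the reported path should point into the cloned repository. This is a minimal check of our own, not a documented step:

# confirm the package imports and resolves to the cloned source tree
import workflow_array_ephys
print(workflow_array_ephys.__file__)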

Step 4 - Jupyter Notebook

  • Register an IPython kernel with Jupyter
    ipython kernel install --name=workflow-array-ephys
    

Step 5 - Configure the dj_local_conf.json

We provide a tutorial notebook, 01-configuration, to guide you through this step.

At the root of the repository folder, create a new file dj_local_conf.json with the following template:

{
  "database.host": "<hostname>",
  "database.user": "<username>",
  "database.password": "<password>",
  "loglevel": "INFO",
  "safemode": true,
  "display.limit": 7,
  "display.width": 14,
  "display.show_tuple_count": true,
  "custom": {
      "database.prefix": "<neuro_>",
      "ephys_root_data_dir": "<C:/data/ephys_root_data_dir>"
    }
}
  • Specify the database's hostname, username, and password.

  • Specify a database.prefix, under which the schemas will be created.

  • Set up your data directory (ephys_root_data_dir) following the convention described below.
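
Once dj_local_conf.json is in place, a quick way to verify the configuration is to connect from a Python session started at the repository root (DataJoint reads dj_local_conf.json from the current directory). A minimal check, assuming the credentials above are valid:

import datajoint as dj

dj.conn()  # connects using the credentials from dj_local_conf.json
print(dj.config["custom"]["ephys_root_data_dir"])  # the configured data directory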

Installation complete

  • At this point the setup of this workflow is complete.

Directory structure and file naming convention

The workflow presented here is designed to work with the directory structure and file naming convention described below.

  • The ephys_root_data_dir is configurable in dj_local_conf.json, under the custom/ephys_root_data_dir variable

  • The subject directory names must match the identifiers of your subjects in the subjects.csv file

  • The session directories can have any naming convention

  • Each session can contain recordings from multiple probes; the probe directories must match the following naming convention:

    *[0-9] (where [0-9] is a one digit number specifying the probe number)

  • Each probe directory should contain:

    • One Neuropixels meta file, with the following naming convention:

      *[0-9].ap.meta

    • Potentially one Kilosort output folder

root_data_dir/
└───subject1/
│   └───session0/
│   │   └───imec0/
│   │   │   │   *imec0.ap.meta
│   │   │   └───ksdir/
│   │   │       │   spike_times.npy
│   │   │       │   templates.npy
│   │   │       │   ...
│   │   └───imec1/
│   │       │   *imec1.ap.meta
│   │       └───ksdir/
│   │           │   spike_times.npy
│   │           │   templates.npy
│   │           │   ...
│   └───session1/
│   │   │   ...
└───subject2/
│   │   ...

We provide an example dataset to run through this workflow. Instructions for downloading the data are in the notebook 00-data-download.

Running this workflow

For new users, we recommend using the following two notebooks to run through the workflow.

In general, once your data directory is configured with the above convention, populating the pipeline with your data amounts to three steps:

  1. Insert meta information (e.g. subjects, sessions) - modify:

    • user_data/subjects.csv
    • user_data/sessions.csv
  2. Import session data - run:

    python workflow_array_ephys/ingest.py
    
  3. Import clustering data and populate downstream analyses (see the sketch after this list) - run:

    python workflow_array_ephys/populate.py
    
  • To insert new subjects, sessions, or analysis parameters, re-execute step 1.

  • Rerun steps 2 and 3 every time new sessions or clustering data become available.

  • In fact, steps 2 and 3 can be executed as scheduled jobs that automatically process any data newly placed into the ephys_root_data_dir.
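
To illustrate what steps 2 and 3 involve, here is a simplified sketch of the kind of populate calls a script like populate.py makes; the exact set of tables and options may differ from the actual script:

from workflow_array_ephys.pipeline import ephys

# populate() finds all entries whose upstream dependencies are satisfied
# and computes the corresponding results; it is safe to rerun, since
# already-processed entries are skipped.
settings = {"display_progress": True}
ephys.EphysRecording.populate(**settings)
ephys.Clustering.populate(**settings)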

Interacting with the DataJoint pipeline and exploring data

For new users, we recommend using our notebook 05-explore to interact with the pipeline.

In general:

  • Connect to database and import tables

    from workflow_array_ephys.pipeline import *
    
  • View ingested/processed data

    subject.Subject()
    session.Session()
    ephys.ProbeInsertion()
    ephys.EphysRecording()
    ephys.Clustering()
    ephys.Clustering.Unit()
    
  • If you need to drop all schemas, do so in the following dependency order (see also the notebook 06-drop):

    from workflow_array_ephys.pipeline import *
    
    ephys.schema.drop()
    probe.schema.drop()
    session.schema.drop()
    subject.schema.drop()
    lab.schema.drop()
    

Developer Guide

Development mode installation

This method allows you to modify the source code for workflow-array-ephys, element-array-ephys, element-animal, element-session, and element-lab.

  • Launch a new terminal and change directory to where you want to clone the repositories
    cd C:/Projects
    
  • Clone the repositories
    git clone https://github.com/datajoint/element-lab
    git clone https://github.com/datajoint/element-animal
    git clone https://github.com/datajoint/element-session
    git clone https://github.com/datajoint/element-array-ephys
    git clone https://github.com/datajoint/workflow-array-ephys
    
  • Install each package with the -e option
    pip install -e ./workflow-array-ephys
    pip install -e ./element-session
    pip install -e ./element-lab
    pip install -e ./element-animal
    pip install -e ./element-array-ephys
    

Running tests

  1. Download the test dataset to your local machine (note the directory where the dataset is saved, e.g. /tmp/testset)

  2. Create an .env file with the following content:

    TEST_DATA_DIR=/tmp/testset

    (replace /tmp/testset with the directory containing the downloaded test dataset)

  3. Run:

    docker-compose -f docker-compose-test.yaml up --build

