Destin

This is the Jiang lab scATAC-seq processing and cell clustering pipeline.

Manuscript

Urrutia, Eugene, et al. "Destin: toolkit for single-cell analysis of chromatin accessibility." bioRxiv (2018): 461905.

Questions & issues

If you have any questions or problems when using destin, please open a new issue on the GitHub repository. You can also email the maintainers; their contact information is listed below.

Installation

The bioinformatics pipeline requires cloning the git repository from GitHub, where yourPathToDestinRepo is your path to the local cloned repository.

Running the vignettes also requires cloning the git repository:

cd yourPathToDestinRepo
git clone https://github.com/urrutiag/destin.git

Install dependencies:

# CRAN packages: install only those that are missing
installed <- rownames(installed.packages())
pkgs <- c("cluster", "data.table", "ggplot2",
          "gridExtra", "irlba",  "Matrix",
          "parallel", "Rtsne")
pkgs <- setdiff(pkgs, installed)
if (length(pkgs))
  install.packages(pkgs, dependencies = c("Depends", "Imports"))

biocPkgs <- c("ChIPpeakAnno", "GenomicAlignments", "rtracklayer")
biocPkgs <- setdiff(biocPkgs, installed)
if (length(biocPkgs)) {
  if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
  BiocManager::install(biocPkgs)
}

# ClusterR is an optional package:
if (!"ClusterR" %in% rownames(installed.packages()))
  install.packages("ClusterR", dependencies = c("Depends", "Imports"))

Running the R package requires either installing it locally from the cloned git repository above

install.packages("yourPathToDestinRepo/package", repos = NULL, type = "source")
library(destin)

or installing it directly from GitHub (note that this does not provide the bioinformatics pipeline or the vignettes):

install.packages("devtools")
devtools::install_github("urrutiag/destin/package")
library(destin)

Dependencies

  • software: SRAtoolkit, cutadapt, BOWTIE2, samtools, picard, MACS2, bedtools, awk, R, python

  • R packages:
    ChIPpeakAnno, cluster, data.table, GenomicAlignments, ggplot2, gridExtra, irlba, Matrix, parallel, rtracklayer, Rtsne

  • Optional R packages: ClusterR

Overview

Bioinformatics Pipeline

Input: fastq files for the entire experiment, or individual fastq files by cell; paired-end reads come as a set of 2 files, one per mate

Output: bam files by cell, peaks file


  • download fastq

  • separate fastq by cell (if combinatorial indexed)

  • cut adapters
  • align
  • sam to bam
  • sort
  • add read group and index
  • mark duplicates
  • remove mitochondrial, unmapped, and chrY reads
  • adjust for Tn5 insertion
  • keep reads with alignment quality >= 30
  • index

  • call peaks (p < 0.01)
  • filter blacklist
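
The read-level filtering steps above (removing mitochondrial and chrY reads, the alignment quality cutoff, and the Tn5 adjustment) can be sketched in a few lines of Python. This is only an illustration, not the pipeline's actual scripts. The BED6-like record layout, the function name `filter_and_shift`, the `chrM`/`chrY` chromosome names, and the +4/-5 bp strand offsets (a common ATAC-seq convention) are all assumptions.

```python
from typing import Iterable, Iterator, NamedTuple


class Read(NamedTuple):
    """Hypothetical BED6-like record for one aligned read."""
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    name: str
    mapq: int
    strand: str  # "+" or "-"


# Assumed conventions; these values are not taken from the destin scripts.
EXCLUDED_CHROMS = {"chrM", "chrY"}
MIN_MAPQ = 30
PLUS_SHIFT, MINUS_SHIFT = 4, -5


def filter_and_shift(reads: Iterable[Read]) -> Iterator[Read]:
    """Drop excluded chromosomes and low-quality reads, then shift for Tn5."""
    for read in reads:
        if read.chrom in EXCLUDED_CHROMS or read.mapq < MIN_MAPQ:
            continue
        shift = PLUS_SHIFT if read.strand == "+" else MINUS_SHIFT
        # Clamp at zero so a shifted read never gets a negative coordinate.
        start = max(0, read.start + shift)
        end = max(start, read.end + shift)
        yield read._replace(start=start, end=end)
```

In the real pipeline these steps operate on bam files with the tools listed under Dependencies; the sketch only shows the logic applied to each read.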

Destin

Input: bam files by cell, peaks file

Output: cluster membership, differential accessibility

  • create a RangedSummarizedExperiment from the bam files and peaks file
  • append experimental information if available
  • annotate regions
  • quality control on cells and regions
  • determine number of clusters
  • cluster cells with destin, which optimizes hyperparameters via a multinomial likelihood
  • calculate differential accessibility
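
To give a feel for the multinomial likelihood mentioned above, the sketch below scores a candidate clustering of a small cells-by-peaks count matrix. It is a simplified illustration, not destin's implementation. The function name `multinomial_loglik`, the pseudocount, and the use of pooled per-cluster peak frequencies are assumptions. The multinomial coefficient is dropped because it does not depend on the cluster labels.

```python
import math
from collections import defaultdict
from typing import Sequence


def multinomial_loglik(counts: Sequence[Sequence[int]],
                       labels: Sequence[int],
                       pseudocount: float = 1.0) -> float:
    """Log-likelihood of cells under per-cluster multinomial peak profiles.

    counts[i][j] is the read count of cell i in peak j; labels[i] is the
    cluster assigned to cell i.
    """
    n_peaks = len(counts[0])

    # Pool counts across the cells of each cluster.
    pooled = defaultdict(lambda: [0.0] * n_peaks)
    for row, label in zip(counts, labels):
        for j, x in enumerate(row):
            pooled[label][j] += x

    # Smoothed log-probability of each peak within each cluster.
    log_probs = {}
    for label, totals in pooled.items():
        denom = sum(totals) + pseudocount * n_peaks
        log_probs[label] = [math.log((t + pseudocount) / denom) for t in totals]

    # Sum each cell's counts weighted by its cluster's log-probabilities.
    return sum(x * log_probs[label][j]
               for row, label in zip(counts, labels)
               for j, x in enumerate(row))


# Toy data: cells 0-1 open mostly in peak 0, cells 2-3 mostly in peak 2.
counts = [[5, 0, 0, 1], [4, 1, 0, 0], [0, 0, 6, 2], [0, 1, 5, 3]]
matching = multinomial_loglik(counts, [0, 0, 1, 1])
mismatched = multinomial_loglik(counts, [0, 1, 0, 1])
print(matching > mismatched)
```

A higher value means the cluster profiles explain the observed counts better, which is the sense in which a likelihood can rank alternative hyperparameter settings.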

GWAS association

Determine whether GWAS results are associated with increased chromatin accessibility in a particular cell type cluster. We use two methods originally developed for scRNA-seq expression: EWCE and MAGMA.

Example Workflows

  • Bioinformatics and Clustering: Buenrostro mouse cells, Fluidigm microfluidic technology

  • Clustering: Preissl P56 forebrain mouse cells, combinatorial barcode technology

  • GWAS cell-type specific association: Preissl P56 forebrain mouse cells

  • Read and cluster 10x Genomics scATAC-seq PBMC data

Citations

de Leeuw, C. A., et al. (2015). MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput. Biol., 11(4), e1004219.

Skene, N. G., et al. (2016). Identification of vulnerable cell types in major brain disorders using single cell transcriptomes and expression weighted cell type enrichment. Front. Neurosci., 10, 16.

Developers & Maintainers

  • Gene Urrutia (gene dot urrutia at gmail dot com)
    Hill-Rom Innovation, Cary, NC

  • Yuchao Jiang (yuchaoj at email dot unc dot edu)
    Department of Biostatistics & Department of Genetics, UNC-Chapel Hill
