bcbioSingleCell

Project Status: Active. The project has reached a stable, usable state and is being actively developed.

R package for bcbio single-cell RNA-seq analysis.

Installation

## try http:// if https:// URLs are not supported
source("https://bioconductor.org/biocLite.R")
biocLite("devtools")
biocLite("remotes")
biocLite("GenomeInfoDbData")
biocLite(
    "hbc/bcbioSingleCell",
    dependencies = c("Depends", "Imports", "Suggests")
)

Load bcbio run

library(bcbioSingleCell)
bcb <- bcbioSingleCell(
    uploadDir = "bcbio_indrop/final",
    interestingGroups = c("genotype", "treatment"),
    sampleMetadataFile = "sample_metadata.csv",
    organism = "Homo sapiens",
    ensemblRelease = 90L
)
# Back up all data inside bcbioSingleCell object
flat <- flatFiles(bcb)
saveData(bcb, flat, dir = "data")

This will return a bcbioSingleCell object, which is an extension of the Bioconductor SingleCellExperiment container class.

Parameters:

  • uploadDir: Path to the bcbio final upload directory.
  • interestingGroups: Character vector of the column names of interest in the sample metadata, which is stored in the sampleData() accessor slot of the bcbioSingleCell object. These values should be formatted in camelCase, and can be reassigned in the object after creation (e.g. interestingGroups(bcb) <- c("batch", "age")). They are used for data visualization in the quality control utility functions.
  • organism: Organism name. Use the full Latin name (e.g. "Homo sapiens").
  • ensemblRelease: Optional. The Ensembl release version to use, as an integer (e.g. 90L).
  • gffFile: Optional. If your transcriptome does not entirely match up to an Ensembl release, you can pass the GTF file of the transcriptome you used to load the annotations instead.

Consult help("bcbioSingleCell", "bcbioSingleCell") for additional documentation.

Sample metadata examples

FASTQ files with samples multiplexed by index barcode

This is our current recommended method for analyzing an inDrops dataset. The sample index barcodes are multiplexed per FASTQ set. For Illumina sequencing data, the raw binary base call (BCL) data must be converted into FASTQs (split into R1-R4 files) using bcl2fastq.

The inDrops library version is automatically detected by bcbio, but ensure that the sample index sequences provided match the library version when attempting to create a bcbioSingleCell object. A current list of inDrops v3 index barcodes is available from seqcloud.

Consult the bcbio documentation for more information on how to configure an inDrops run prior to loading into R with the bcbioSingleCell() function.

description  index  sequence  sampleName
indrops1     17     GGAGGTAA  sample1
indrops1     18     CATAACTG  sample2
indrops2     12     GCGTAAGA  sample3
indrops2     13     CTATTAAG  sample4
indrops2     14     AAGGCTAT  sample5
indrops2     15     GAGCCTTA  sample6
indrops2     16     TTATGCGA  sample7
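
The table above can also be generated from R rather than typed by hand. The following is a minimal sketch using base R only. It assumes the metadata file is comma-separated, uses the column headers shown above, and is saved under the file name passed as sampleMetadataFile in the loading example. The variable names (meta, metaFile) and the uniqueness checks are illustrative and are not part of the package.

```r
# Sketch (base R only): build the multiplexed sample metadata table
# and write it to the CSV file passed to `sampleMetadataFile`.
meta <- data.frame(
    description = c(
        "indrops1", "indrops1",
        "indrops2", "indrops2", "indrops2", "indrops2", "indrops2"
    ),
    index = c(17L, 18L, 12L, 13L, 14L, 15L, 16L),
    sequence = c(
        "GGAGGTAA", "CATAACTG",
        "GCGTAAGA", "CTATTAAG", "AAGGCTAT", "GAGCCTTA", "TTATGCGA"
    ),
    sampleName = paste0("sample", 1:7),
    stringsAsFactors = FALSE
)

# Sanity checks (assumptions of this sketch, not package requirements):
# sample names are unique, and no index sequence repeats within a FASTQ set.
stopifnot(!any(duplicated(meta$sampleName)))
stopifnot(!any(duplicated(meta[, c("description", "sequence")])))

# Assumed file name, matching the loading example.
metaFile <- "sample_metadata.csv"
write.csv(meta, file = metaFile, row.names = FALSE, quote = FALSE)
```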

FASTQ files demultiplexed per sample

This is our current method for handling 10X Genomics Cell Ranger output (using readCellRanger()) and Illumina SureCell sample data.

description  genotype
sample1      wildtype
sample2      knockout
sample3      wildtype
sample4      knockout
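
Because interestingGroups must name columns that exist in the sample metadata, it can help to check this before loading a run. The following is a minimal sketch in base R. The data frame mirrors the table above, and the check itself is an illustration rather than something the package is documented to require you to run.

```r
# Sketch (base R only): confirm that the intended interesting groups
# are present as columns in the sample metadata.
meta <- data.frame(
    description = c("sample1", "sample2", "sample3", "sample4"),
    genotype = c("wildtype", "knockout", "wildtype", "knockout"),
    stringsAsFactors = FALSE
)

# Groups we plan to pass as `interestingGroups`.
interestingGroups <- c("genotype")

# Any group name that is not a metadata column is reported here.
missingGroups <- setdiff(interestingGroups, colnames(meta))
if (length(missingGroups) > 0L) {
    stop(
        "Not found in sample metadata: ",
        paste(missingGroups, collapse = ", ")
    )
}
```

With this table, a value such as c("genotype", "treatment") would stop with an error, because no treatment column is defined.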

Troubleshooting

Maximal number of DLLs reached

Error: package or namespace load failed for 'bcbioSingleCell' in dyn.load(file, DLLpath = DLLpath, ...):
  maximal number of DLLs reached...

Depending on your operating system, you may encounter this error about hitting the DLL limit in R. This issue is becoming more common as RNA-seq analysis packages grow increasingly complex. Luckily, we can configure R to increase the DLL limit. Append this line to your ~/.Renviron file:

R_MAX_NUM_DLLS=150

For more information on this issue, consult help("dyn.load") in the R documentation. The number of loaded DLLs in an R session can be obtained with getLoadedDLLs().
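
To see how close a session is to the limit, the loaded DLL count can be compared against the configured value. This is a minimal sketch using base R only, and the variable names are illustrative. Note that ~/.Renviron is read at startup, so R must be restarted before a change to R_MAX_NUM_DLLS takes effect.

```r
# Number of DLLs currently loaded in this R session.
nDLLs <- length(getLoadedDLLs())

# Value of R_MAX_NUM_DLLS seen by this session; NA if it is not set.
maxDLLs <- Sys.getenv("R_MAX_NUM_DLLS", unset = NA)

message("Loaded DLLs: ", nDLLs)
if (is.na(maxDLLs)) {
    message("R_MAX_NUM_DLLS is not set; R is using its built-in default.")
} else {
    message("R_MAX_NUM_DLLS: ", maxDLLs)
}
```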

References

The papers and software cited in our workflows are available as a shared library on Paperpile.
