DBiT-seq

This is a public repository for all code connected to DBiT-seq (microfluidic Deterministic Barcoding in Tissue for spatial omics sequencing).

Please cite: Yang et al. High-Spatial-Resolution Multi-Omics Atlas Sequencing of Mouse Embryos via Deterministic Barcoding in Tissue. bioRxiv 2019: doi: https://doi.org/10.1101/788992.

Schematic workflow

[Figure: schematic workflow]

All raw and processed files are available at GEO (GSE137986)

Pre-processing

This is code for quality control and for reformatting the read file for compatibility with st-pipeline.

In our datasets, read 2 contains the barcode and UMI, so the read file must be reformatted before st-pipeline can process it.

To reformat the read file, run:

perl reformat.pl -indir 01.rawdata -outdir 02.reformatdata -sample 10t
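
reformat.pl itself is not reproduced here. The following Python sketch only illustrates the general idea of the step: cutting the barcode and UMI segments out of read 2 and joining them into one short read with the barcodes first and the UMI last. The segment offsets, the function names and the record layout are all hypothetical. The offsets are chosen only so that the 10-base UMI ends up at positions 16 to 26, matching the --umi-start-position and --umi-end-position values passed to st-pipeline.

```python
from typing import Iterable, Iterator, Sequence, Tuple

FastqRecord = Tuple[str, str, str]  # (header, sequence, quality)

# Hypothetical 0-based, end-exclusive offsets inside the original read 2.
# The real offsets depend on the library design and are NOT taken from reformat.pl.
BARCODE_B = (32, 40)
BARCODE_A = (70, 78)
UMI = (22, 32)


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield (header, sequence, quality) from 4-line FASTQ records."""
    it = iter(lines)
    # zip over the same iterator four times groups the lines into records;
    # an incomplete trailing record is silently dropped.
    for header, seq, _plus, qual in zip(it, it, it, it):
        yield header.rstrip("\n"), seq.rstrip("\n"), qual.rstrip("\n")


def rebuild_barcode_read(
    record: FastqRecord,
    segments: Sequence[Tuple[int, int]] = (BARCODE_B, BARCODE_A, UMI),
) -> FastqRecord:
    """Concatenate the selected segments of the sequence and of the quality string."""
    header, seq, qual = record
    new_seq = "".join(seq[start:end] for start, end in segments)
    new_qual = "".join(qual[start:end] for start, end in segments)
    return header, new_seq, new_qual
```

With these assumed offsets, the rebuilt read is 26 bases long: 16 bases of spatial barcode followed by the 10-base UMI.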

Generate gene expression matrix

To generate the expression matrix file, run st-pipeline (v1.7.2):

sample=10t
FW=/02.reformatdata/$sample/$sample.R1.fastq.gz
RV=/02.reformatdata/$sample/$sample.R2.fastq.gz
MAP=/database/GRCm38_86/StarIndex
ANN=/database/GRCm38_86/gencode.vM11.annotation.gtf
CONT=/database/GRCm38_86/ncRNA/StarIndex
ID=barcodes.xls
OUTPUT=/03.stpipeline/$sample
mkdir -p /03.stpipeline/$sample
TMP=/03.stpipeline/$sample/tmp
mkdir -p /03.stpipeline/$sample/tmp
EXP=$sample

st_pipeline_run.py \
  --output-folder $OUTPUT \
  --temp-folder $TMP \
  --umi-start-position 16 \
  --umi-end-position 26 \
  --ids $ID \
  --ref-map $MAP \
  --ref-annotation $ANN \
  --expName $EXP \
  --htseq-no-ambiguous \
  --verbose \
  --mapping-threads 16 \
  --log-file $OUTPUT/${EXP}_log.txt \
  --two-pass-mode \
  --no-clean-up \
  --contaminant-index $CONT \
  --disable-clipping \
  --min-length-qual-trimming 30 \
  $FW $RV
convertEnsemblToNames.py \
  --annotation /database/GRCm38_86/gencode.vM11.annotation.gtf \
  --output /03.stpipeline/$sample/${sample}_stdata.updated.tsv \
  /03.stpipeline/$sample/${sample}_stdata.tsv

To visualize the expression map, run a modified st_qa-new.py script:

st_qa-new.py --input-data /03.stpipeline/$sample/${sample}_stdata.updated.tsv

To derive the expression map of individual mRNA genes, we applied global normalization with “Scran” followed by a log-scale transformation:

st_data_plotter.py --normalization Scran --show-genes Notch1 --image-files 10t.png --counts-table-files 10t.under-tissue.tsv --use-log-scale --dot-size 8

Image alignment in Adobe Illustrator (AI)

Now that we have an stdata file containing all the gene expression data for all the squares, we would like to remove all the squares that are not located under the tissue.

  1. Convert each location and its read count from the expression matrix to Scalable Vector Graphics (SVG) format. SVG images and their related behaviors are defined in XML text files, which means they can be freely edited, searched, indexed and scripted.
  2. Open the image file and the SVG file in Illustrator, and manually align the squares in the SVG file with the image.
  3. Turn off visibility of the image layer, use the Selection Tool to select all the squares that are not located under the tissue, and delete them.
  4. Save the SVG image as an XML text file, which now contains the locations of only the squares under the tissue.
  5. Extract these locations from the expression matrix data.
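
Steps 1, 4 and 5 above can be sketched in Python with the standard library alone. This is a minimal illustration, not the script used for the paper. It assumes that each row of the stdata matrix starts with a spot label of the form XxY, and it tags each square with a hypothetical id of the form spot_XxY so that it can be recognized again after editing. Illustrator may rewrite ids or coordinates on export, so a real script would have to be checked against the saved file.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def matrix_to_svg(spot_counts, pixel=10):
    """Step 1: draw one square per spot; opacity scales with the read count.

    spot_counts maps an assumed "XxY" spot label to its total read count.
    """
    ET.register_namespace("", SVG_NS)
    root = ET.Element(f"{{{SVG_NS}}}svg")
    max_total = max(spot_counts.values(), default=0) or 1
    for spot, total in spot_counts.items():
        x, y = (int(v) for v in spot.split("x"))
        ET.SubElement(root, f"{{{SVG_NS}}}rect", {
            "id": f"spot_{spot}",  # hypothetical id scheme
            "x": str(x * pixel),
            "y": str(y * pixel),
            "width": str(pixel),
            "height": str(pixel),
            "fill": "red",
            "fill-opacity": f"{total / max_total:.2f}",
        })
    return ET.tostring(root, encoding="unicode")


def spots_in_svg(svg_text):
    """Step 4: collect the spot labels of the squares still present in the SVG."""
    root = ET.fromstring(svg_text)
    prefix = "spot_"
    return {
        el.get("id")[len(prefix):]
        for el in root.iter(f"{{{SVG_NS}}}rect")
        if el.get("id", "").startswith(prefix)
    }


def filter_matrix(rows, keep):
    """Step 5: keep the header row and the rows whose spot label is in keep."""
    return [rows[0]] + [row for row in rows[1:] if row[0] in keep]
```

A typical use would be to write the output of matrix_to_svg to a file, edit it in Illustrator, read the saved file back with spots_in_svg, and pass the result to filter_matrix together with the rows of the expression matrix.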

Spatial differential expression analysis

This is code for differential expression analysis.

Figure 2G: use the “st_qa.py” script in st-pipeline to perform the quality assessment.
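
As a rough idea of the per-spot numbers that such a quality assessment looks at, the sketch below computes the total count and the number of detected genes for every spot. It is not the st_qa.py implementation. The function name is hypothetical, and the input is assumed to be a tab-separated matrix with genes as columns and one row per spot.

```python
import csv
import io


def per_spot_qc(tsv_text):
    """Return {spot: (total_counts, genes_detected)} for a spots-by-genes TSV."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    next(reader)  # skip the header row holding the gene names
    stats = {}
    for row in reader:
        if not row:
            continue  # ignore blank lines
        spot = row[0]
        values = [float(v) for v in row[1:]]
        # A gene counts as detected when at least one count was recorded.
        stats[spot] = (sum(values), sum(v > 0 for v in values))
    return stats
```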

Figure 3B: Spatially variable genes identified by SpatialDE were used for the clustering analysis. After the raw values were log-transformed, non-negative matrix factorization (NMF) was performed using the NNLM package in R. We chose k = 11 for the mouse embryo DBiT-seq transcriptome data obtained at a 50 μm pixel size. For each pixel, the largest factor loading from NMF was used to assign cluster membership. The NMF clustering of pixels was plotted by tSNE using the “Rtsne” package in R.
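
The cluster assignment rule can be illustrated independently of the factorization itself. The sketch below is in Python rather than R and does not run NMF. It assumes the per-pixel factor loadings have already been computed (for example by NNLM) and are available as one list of k values per pixel. The function name is hypothetical.

```python
from collections import Counter


def assign_clusters(loadings):
    """Return, for each pixel, the index of its largest factor loading.

    loadings holds one sequence of k non-negative values per pixel.
    Ties are resolved in favor of the lowest factor index.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in loadings]


def cluster_sizes(labels):
    """Count how many pixels fall into each cluster."""
    return dict(Counter(labels))
```

With k = 11, every pixel receives a label between 0 and 10, and those labels can then be used to color the pixels in a tSNE plot.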
