DRAFTS

DNA Regulatory element Analysis by cell-Free Transcription and Sequencing

Code and materials from the paper "Multiplex transcriptional characterizations across diverse bacterial species using cell-free systems". Yim SS*, Johns NI*, Park J, Gomes ALC, McBee RM, Richardson M, Ronda C, Chen SP, Garenne D, Noireaux V, Wang HH. Molecular Systems Biology (2019) 15, e8875. * denotes equal contribution.

The full paper and supplementary information can be accessed here.

Raw sequencing data can be found at NCBI SRA under PRJNA509603.

dependencies

The following must be installed prior to executing the code in this repository. For Python packages, it may be convenient to obtain these through a distribution such as Anaconda. Installation should only take a few minutes.

  • python 3.6.X, ipython/jupyter
    • biopython
    • pandas
    • numpy
    • scipy
    • matplotlib
    • seaborn
  • bbmerge

1. processing of raw sequencing data

01_DRAFTS_process_raw.sh

  • expects a NextSeq/MiSeq raw data folder in which each sample folder holds 2 files, R1 and R2 (paired-end reads), and reads sequenced on different flowcell lanes are kept in four separate folders labeled with _L00n
  • assumes folder names are Samplename_L001_, Samplename_L002_, Samplename_L003_, or Samplename_L004_, where Samplename is the SampleID of each sample in the sample sheet for the Illumina sequencing run
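
The naming convention above can be checked before running the script. The following is a minimal sketch, not part of this repository: the function names group_by_sample and missing_lanes are hypothetical, and the pattern assumes that anything after the trailing underscore of _L00n_ can be ignored.

```python
import re
from collections import defaultdict

# Assumed pattern: "<SampleID>_L00n_" with n in 1-4, following the convention above.
LANE_PATTERN = re.compile(r"^(?P<sample>.+)_L00(?P<lane>[1-4])_")


def group_by_sample(folder_names):
    """Map each SampleID to the sorted list of lane numbers found for it."""
    lanes = defaultdict(list)
    for name in folder_names:
        match = LANE_PATTERN.match(name)
        if match is None:
            continue  # skip folders that do not follow the convention
        lanes[match.group("sample")].append(int(match.group("lane")))
    return {sample: sorted(found) for sample, found in lanes.items()}


def missing_lanes(grouped, expected=(1, 2, 3, 4)):
    """Return, for each incomplete sample, the lane numbers that were not found."""
    report = {}
    for sample, found in grouped.items():
        absent = [lane for lane in expected if lane not in found]
        if absent:
            report[sample] = absent
    return report
```

Passing the output of os.listdir(search_dir) to group_by_sample, then the result to missing_lanes, lists the samples for which one of the four lane folders is absent.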

run 01_DRAFTS_process_raw.sh to 1) find and combine the raw NextSeq data in search_dir, 2) unzip it to out_dir, and 3) assemble the paired-end reads

bash 01_DRAFTS_process_raw.sh [search_dir] [out_dir (optional)]

after running 01_DRAFTS_process_raw.sh, group the DNA-seq and RNA-seq reads in separate folders for further analysis

2. error filtering and barcode counting

02_DRAFTS_extract_data.py

  • out_dir should contain a folder named 01_bccounts with 2 empty folders inside named [01_dna_bccounts, 02_rna_bccounts]
  • out_dir should also contain a folder named 02_log with 11 empty folders inside named [01_bccounts, 02_lowq, 03_missingadapter, 04_badbc, 05_goodbc_badalign, 06_frag, 07_goodbc_perfectalign, 08_goodbc_goodalign, 09_goodbc_perfectalign_bccounts, 10_goodbc_goodalign_bccounts, 11_log_files]
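
Creating these folders by hand is error-prone, so the layout can be scripted. This is a minimal sketch using only the standard library; the function name make_extract_layout is hypothetical and not part of this repository, and the folder names are copied from the list above.

```python
from pathlib import Path

# Folder names copied from the list above.
BCCOUNT_DIRS = ["01_dna_bccounts", "02_rna_bccounts"]
LOG_DIRS = [
    "01_bccounts",
    "02_lowq",
    "03_missingadapter",
    "04_badbc",
    "05_goodbc_badalign",
    "06_frag",
    "07_goodbc_perfectalign",
    "08_goodbc_goodalign",
    "09_goodbc_perfectalign_bccounts",
    "10_goodbc_goodalign_bccounts",
    "11_log_files",
]


def make_extract_layout(out_dir):
    """Create the empty folders expected under out_dir and return their paths."""
    out = Path(out_dir)
    created = []
    for parent, children in (("01_bccounts", BCCOUNT_DIRS), ("02_log", LOG_DIRS)):
        for child in children:
            path = out / parent / child
            # exist_ok=True makes the call safe to repeat on an existing layout
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    return created
```

Calling make_extract_layout with the same path later passed as [out_dir] prepares the folder tree in one step.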

run 02_DRAFTS_extract_data.py to 1) filter out errors from oligo library synthesis or sequencing, 2) extract barcode counts, and 3) collect other information for QC and additional analysis

python 02_DRAFTS_extract_data.py [ref_csv] [dna_directory] [rna_directory] [out_dir]

3. calculation of transcription levels

03_DRAFTS_compute_tx.py

  • out_dir should contain a folder named 01_tx

run 03_DRAFTS_compute_tx.py to compute 1) the abundances of DNA and RNA barcode counts and 2) transcription levels

python 03_DRAFTS_compute_tx.py [ref_csv] [dna_bc_directory] [rna_bc_directory] [out_dir]
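
To illustrate the kind of calculation this step performs, here is a minimal sketch under stated assumptions. It assumes that abundance means a barcode's share of the total counts in its library, and that the transcription level is the ratio of RNA abundance to DNA abundance. This README does not confirm either definition, and the function names and the min_dna_count parameter are hypothetical, so 03_DRAFTS_compute_tx.py may differ.

```python
def relative_abundance(counts):
    """Convert raw barcode counts into fractions of the library total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads counted")
    return {barcode: n / total for barcode, n in counts.items()}


def transcription_levels(dna_counts, rna_counts, min_dna_count=1):
    """Assumed definition: Tx = RNA abundance / DNA abundance per barcode.

    Barcodes with fewer than min_dna_count DNA reads are skipped, because
    the ratio is undefined or unreliable without DNA coverage.
    """
    dna_abundance = relative_abundance(dna_counts)
    rna_abundance = relative_abundance(rna_counts)
    tx = {}
    for barcode, n_dna in dna_counts.items():
        if n_dna <= 0 or n_dna < min_dna_count:
            continue
        # a barcode with no RNA reads gets a transcription level of 0
        tx[barcode] = rna_abundance.get(barcode, 0.0) / dna_abundance[barcode]
    return tx
```

Under these assumptions, a barcode that makes up a larger share of the RNA library than of the DNA library gets a level above 1, and a barcode seen only in the RNA library is left out.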
