NGSclean

A pipeline to trim the reads and remove ribosomal RNA from RNA-Seq data

This pipeline trims adaptors from the ends of reads and removes plant ribosomal RNA (rRNA) reads from RNA-Seq data. (It also works for data from animal species when the corresponding rRNA sequences are provided.) The process is based on two tools: Trimmomatic for trimming and STAR for mapping.

First, make sure you know how to run these two tools on your system.

On Sapelo/UGA, they are run like this:
Trimmomatic

module load java/jdk1.8.0_20 
/usr/local/apps/trimmomatic/0.33/trimmomatic-0.33.jar 

STAR

module load java/jdk1.7.0_67
/usr/local/apps/star/latest/bin/STAR
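
Before going further, it can help to confirm that both paths actually exist on your system. The following is a minimal sketch, assuming the Sapelo paths shown above; `check_path` is a hypothetical helper name used only for illustration and is not part of NGSclean.

```shell
#!/bin/sh
# Hypothetical helper (not part of NGSclean): report whether a
# required file exists and is readable.
check_path() {
  if [ -r "$1" ]; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Paths as used on Sapelo/UGA; adjust them for your own system.
# '|| true' keeps the script going so that both paths are reported.
check_path /usr/local/apps/trimmomatic/0.33/trimmomatic-0.33.jar || true
check_path /usr/local/apps/star/latest/bin/STAR || true
```

If either line reports "missing", find the correct install location on your cluster before continuing.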

Dependencies:
Trimmomatic http://www.usadellab.org/cms/?page=trimmomatic
STAR https://github.com/alexdobin/STAR

0. Copy the scripts and generate the index file for the rRNA sequences

Copy the NGSclean directory to your system, for example:
/lustre1/lxue/NGSclean

Prepare the rRNA reference for STAR:

NGSclean=/lustre1/lxue/NGSclean 

cd $NGSclean
mkdir rRNA_ref
module load java/jdk1.7.0_67
/usr/local/apps/star/2.4.2a/bin/STAR \
  --runThreadN 2  \
  --runMode genomeGenerate  \
  --genomeDir rRNA_ref  \
  --genomeChrBinNbits  5 \
  --genomeFastaFiles rRNA_only_NR.fas  

Keep '--genomeDir rRNA_ref' as it is; change 'rRNA_only_NR.fas' only if a different reference file is needed.
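
If you do swap in a different reference file, a quick look at it before building the index can save a failed run. This is only a sketch: `count_fasta_records` is a hypothetical helper, and it assumes a standard FASTA file in which every sequence starts with a '>' header line.

```shell
#!/bin/sh
# Hypothetical helper (not part of NGSclean): count the sequence
# records in a FASTA file by counting header lines that start with '>'.
count_fasta_records() {
  grep -c '^>' "$1"
}

# Reference file name taken from the STAR command above.
ref=rRNA_only_NR.fas
if [ -s "$ref" ]; then
  echo "$ref contains $(count_fasta_records "$ref") sequences"
else
  echo "$ref is missing or empty"
fi
```

Run it from the NGSclean directory, since the file name is relative.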

1. Prepare design file

cd /lustre1/lxue/RNAseq/01clean  
NGSclean=/lustre1/lxue/NGSclean  
python $NGSclean/generate_design_file.py -f /lustre1/lxue/RNAseq/00reads -d RNAseq_design.txt -p 

Check the design file and revise the sample names if necessary; each sample name is used as the prefix for its fastq files.
more RNAseq_design.txt 

2. Trim and Clean

cd /lustre1/lxue/RNAseq/01clean
Settings:
NGSclean=/lustre1/lxue/NGSclean
trimmoFull=/usr/local/apps/trimmomatic/0.33/trimmomatic-0.33.jar 
starFull=/usr/local/apps/star/latest/bin/STAR
adaptor=/usr/local/apps/trimmomatic/latest/adapters/TruSeq3-PE.fa
trimmo_module=java/jdk1.8.0_20 
star_module=java/jdk1.7.0_67

Generate the shell files:

-s how to treat singleton reads: merge, keep, or discard
-q how to run the jobs: queue (default) or inter (run interactive jobs)

python $NGSclean/trim_and_clean.py -d RNAseq_design.txt -t 8  -s merge \
  --run_trimmomatic $trimmoFull --load_trimmo_module $trimmo_module  --adaptor $adaptor  \
  --run_star $starFull --load_star_module $star_module 

python $NGSclean/trim_and_clean.py -d RNAseq_design.txt -t 8  -s keep \
  --run_trimmomatic $trimmoFull --load_trimmo_module $trimmo_module  --adaptor $adaptor  \
  --run_star $starFull --load_star_module $star_module 

python $NGSclean/trim_and_clean.py -d RNAseq_design.txt -t 8  -s discard -q inter \
  --run_trimmomatic $trimmoFull --load_trimmo_module $trimmo_module  --adaptor $adaptor  \
  --run_star $starFull --load_star_module $star_module 

Then run the master shell script to submit the jobs or run them interactively.
For example:

chmod 750 RNAseq_design.sh
./RNAseq_design.sh

Contributors

liangjiaoxue