
grosselab / viper_hif1alpha


Viper is a Snakemake workflow, aimed at performing the RNA-seq workflow of the paper 'Causes and consequences of a glutamine induced normoxic HIF1 activity for the tumor metabolism', Kappler et al. (2019) in a reproducible and automated manner

License: Other

R 72.26% Python 20.50% Perl 7.24%
rna-seq-pipeline snakemake differential-gene-expression reproducible-research r python


Viper workflow

Viper is a Snakemake workflow that performs the RNA-seq workflow of the paper 'Causes and Consequences of A Glutamine Induced Normoxic HIF1 Activity for the Tumor Metabolism', Kappler et al. (2019) in a reproducible, automated, and partially contained manner. It is implemented such that alternative or similar analyses can be added or removed.

Viper consists of a Snakefile (workflow/HIF_version_1.0/snakefile), conda environment files (envs/*.yaml), a configuration file (workflow/HIF_version_1.0/config.yaml), a set of R functions (R/*.R), and a set of R scripts (scripts/*.R) to perform quality control, preprocessing, differential expression analysis, and functional annotation of RNA-seq data.

By default, the pipeline performs all the steps shown in the diagram below. Advanced users can modify the Snakefile and config.yaml and/or add "custom rules" to enable additional functions. Currently, transcript quantification with Salmon at the read level or gene quantification with featureCounts can be activated.

Workflow graph

This workflow performs differential expression analysis on paired-end RNA-seq data. After adapter removal with Cutadapt and quality filtering with sickle, reads were mapped with STAR to the human genome (GRCh38.82), and transcript counts were quantified with Salmon. These transcript counts were summarized to gene counts with tximport. Integrated normalization and differential expression analysis were conducted with edgeR. Further, we used the Database for Annotation, Visualization and Integrated Discovery (DAVID v6.8) for functional annotation of the differentially expressed genes.
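
The summarization of transcript counts to gene counts can be illustrated with a minimal Python sketch. It is not the tximport implementation (tximport is an R package with additional options such as length-scaled counts); it only shows the basic idea of summing the counts of all transcripts that belong to the same gene. The function name, the transcript and gene IDs, and the counts are hypothetical.

```python
from collections import defaultdict


def summarize_to_gene_counts(transcript_counts, tx2gene):
    """Sum transcript-level counts into gene-level counts.

    transcript_counts: dict mapping transcript ID -> estimated count
    tx2gene: dict mapping transcript ID -> gene ID
    Transcripts without an entry in tx2gene are skipped in this sketch.
    """
    gene_counts = defaultdict(float)
    for transcript_id, count in transcript_counts.items():
        gene_id = tx2gene.get(transcript_id)
        if gene_id is None:
            continue  # no gene annotation for this transcript
        gene_counts[gene_id] += count
    return dict(gene_counts)


# Hypothetical toy values, not taken from the study.
transcript_counts = {"tx1": 10.5, "tx2": 4.5, "tx3": 7.0, "tx4": 1.0}
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

gene_counts = summarize_to_gene_counts(transcript_counts, tx2gene)
print(gene_counts)
```

In this toy example, tx1 and tx2 are summed into geneA, and tx4 is dropped because it has no gene assignment.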

DAG

Set up the Viper workflow

Assuming that snakemake and conda are installed (and your system has the necessary libraries to compile R packages), you can use the following commands on a test dataset:

0. Step - Clone the GitHub repository

git clone https://github.com/GrosseLab/ViperWF.git viper

1. Step - Set up the needed folders and copy files from viper/workflow/HIF_version_1.0

Folder and File Structure

Here is the basic suggested skeleton for your project folder:

  .
  ├── data
  │   ├── qPCR                 # qPCR raw data
  │   └ *.fastq.gz             # all 'fastq.gz'-files from !...!
  │
  ├── references
  │   └── hg38                 # all data from Homo_sapiens.GRCh38.82
  │       ├ Homo_sapiens.GRCh38.82.gtf                   # annotation
  │       ├ Homo_sapiens.GRCh38.dna.primary_assembly.fa  # genome sequence
  │       └ Homo_sapiens.GRCh38.82.EXON.fa               # exon sequences of all transcripts in the GTF
  │
  ├── logs
  ├── report
  │
  ├── viper                    # GitHub repository
  │   ├── report               # Snakemake report definition
  │   ├── wrapper              # Snakemake wrapper
  │   ├── rules                # Snakemake rules
  │   ├── scripts              # Snakemake scripts
  │   ├── workflow             # Snakemake final workflows
  │   │   └ HIF_version_1.0
  │   ├── R                    # R functions needed to run the analysis
  │   └── man                  # R functions manual
  │
  ├── Snakefile                # file from ./viper/workflow/HIF_version_1.0
  ├── config.yaml              # file from ./viper/workflow/HIF_version_1.0
  ├── units.tsv                # file from ./viper/workflow/HIF_version_1.0
  └── samples.tsv              # file from ./viper/workflow/HIF_version_1.0

Make folders and copy files from viper/workflow/HIF_version_1.0

# create the project folders
mkdir data
mkdir data/qPCR
mkdir references
mkdir logs
mkdir report

# copy the workflow files into the project folder
cp ./viper/workflow/HIF_version_1.0/Snakefile .
cp ./viper/workflow/HIF_version_1.0/config.yaml .
cp ./viper/workflow/HIF_version_1.0/units.tsv .
cp ./viper/workflow/HIF_version_1.0/samples.tsv .

# copy the qPCR raw data
cp ./viper/workflow/HIF_version_1.0/copy.csv ./data/qPCR/
cp ./viper/workflow/HIF_version_1.0/qPCR_data.csv ./data/qPCR/
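
The same setup can also be scripted. The following Python sketch is not part of the Viper repository; the function name `set_up_project` is hypothetical, and the folder and file names are the ones listed in the skeleton above. The demo at the end runs in a temporary directory with empty placeholder files, so it does not touch a real project folder.

```python
import shutil
import tempfile
from pathlib import Path

FOLDERS = ["data/qPCR", "references", "logs", "report"]
WORKFLOW_FILES = ["Snakefile", "config.yaml", "units.tsv", "samples.tsv"]
QPCR_FILES = ["copy.csv", "qPCR_data.csv"]


def set_up_project(project_dir, workflow_dir):
    """Create the project folders and copy the workflow files into place."""
    project_dir = Path(project_dir)
    workflow_dir = Path(workflow_dir)

    for folder in FOLDERS:
        (project_dir / folder).mkdir(parents=True, exist_ok=True)

    copied = []
    # Workflow files go to the top level of the project folder.
    for name in WORKFLOW_FILES:
        copied.append(Path(shutil.copy(workflow_dir / name, project_dir / name)))
    # qPCR raw data goes to data/qPCR.
    for name in QPCR_FILES:
        target = project_dir / "data" / "qPCR" / name
        copied.append(Path(shutil.copy(workflow_dir / name, target)))
    return copied


# Demo in a temporary directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    workflow = Path(tmp) / "viper" / "workflow" / "HIF_version_1.0"
    workflow.mkdir(parents=True)
    for name in WORKFLOW_FILES + QPCR_FILES:
        (workflow / name).touch()
    copied = set_up_project(tmp, workflow)
    copied_names = sorted(p.relative_to(tmp).as_posix() for p in copied)
    print(copied_names)
```

For a real project, the call would be `set_up_project(".", "./viper/workflow/HIF_version_1.0")` from inside the project folder.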

2. Step - Download data (will be available soon)

Download the data from the Gene Expression Omnibus (GEO) project GSExxx using the NCBI SRA Toolkit:

download the sra files using the 'SRA Run Selector' or the SRA Toolkit from https://www.ncbi.nlm.nih.gov/geo/query/XXXX
convert the *.sra files to *.fastq.gz files using fastq-dump from the SRA Toolkit
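
As a rough sketch of the conversion step, the following Python helper only builds a fastq-dump command line and prints it; it does not run anything. The options `--gzip`, `--split-files`, and `--outdir` are fastq-dump options, but whether the resulting file names match what units.tsv expects is an assumption. The function name and the run accession `SRR0000000` are placeholders, since the real accessions are not yet published.

```python
import shlex


def fastq_dump_command(sra_file, out_dir="data"):
    """Build a fastq-dump call for one paired-end run.

    --gzip        compress the output (*.fastq.gz)
    --split-files write the two mates of each read pair to separate files
    --outdir      folder for the output files
    """
    return ["fastq-dump", "--gzip", "--split-files", "--outdir", out_dir, sra_file]


# Placeholder accession; replace with the real run files once available.
command = fastq_dump_command("SRR0000000.sra")
print(" ".join(shlex.quote(part) for part in command))
```

The list returned by the helper could be passed to `subprocess.run(command, check=True)` once the SRA Toolkit is installed and the run files are downloaded.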

3. Step - Run Snakemake

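# dry run: list the jobs without executing them (-n) and keep going after failed jobs (-k)
snakemake -kn
# create the conda environments only (newer Snakemake releases name this flag --conda-create-envs-only)
snakemake --create-envs-only --use-conda
# run the workflow with up to 20 parallel jobs (-j 20) and print the shell commands (-p)
snakemake -k -p --use-conda -j 20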
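
Before launching the commands above, it can help to check that the project folder matches the suggested skeleton. The following Python sketch is not part of Viper; the function name `missing_inputs` is hypothetical, and the lists of required files and folders are taken from the folder structure described in Step 1.

```python
from pathlib import Path

REQUIRED_FILES = ["Snakefile", "config.yaml", "units.tsv", "samples.tsv"]
REQUIRED_DIRS = ["data", "references", "logs", "report", "viper"]


def missing_inputs(project_dir="."):
    """Return the required files and folders that are absent from project_dir.

    Folders are reported with a trailing slash to tell them apart from files.
    """
    root = Path(project_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in REQUIRED_DIRS if not (root / name).is_dir()]
    return missing


# Check the current working directory and report what is missing.
missing = missing_inputs(".")
if missing:
    print("Missing before running snakemake:", ", ".join(missing))
else:
    print("Project folder looks complete.")
```

The check only looks at names, not at file contents, so an empty config.yaml would still pass.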

New folder: results

  .
  ├── data
  ├── references
  ├── report
  ├── viper                    # GitHub repository
  │
  ├── logs                     # logs of the Snakemake rules
  ├── results                  # new folder for the results of the Snakemake rules
  │
  ├── Snakefile
  ├── config.yaml
  ├── units.tsv
  └── samples.tsv
