
rnaseq_workflow's People

Contributors

deanpettinga, genomics-kl, ianbed, sgivan

rnaseq_workflow's Issues

update STAR resources

Optimize the resources allocated to STAR on HPC3 (120gb, 8ppn) so that multiple jobs can run on a single node.

interesting additions

It would be nice to print the GO.XX tables that go into the clusterProfiler functions.
It would also be nice to be able to toggle enrichment between up- and down-regulated genes.
Just some thoughts.

Build STAR index?

It might be advantageous to automate STAR index building so that this step is also hands-off.
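
A minimal sketch of what automating this could look like, assuming the index is built with STAR's `genomeGenerate` run mode. The function names, the default read length and thread count, and the check for an existing index (looking for STAR's `SA` file) are illustrative assumptions, not part of the current workflow.

```python
from pathlib import Path


def star_index_command(genome_dir, fasta, gtf, read_length=100, threads=8):
    """Return the STAR genomeGenerate command as an argument list.

    read_length and threads are placeholder defaults (assumptions).
    """
    return [
        "STAR",
        "--runMode", "genomeGenerate",
        "--runThreadN", str(threads),
        "--genomeDir", str(genome_dir),
        "--genomeFastaFiles", str(fasta),
        "--sjdbGTFfile", str(gtf),
        # Splice-junction overhang is conventionally read length minus 1.
        "--sjdbOverhang", str(read_length - 1),
    ]


def index_exists(genome_dir):
    """Assumption: a finished index directory contains STAR's 'SA' file."""
    return (Path(genome_dir) / "SA").is_file()
```

The returned list could be passed to `subprocess.run(cmd, check=True)` only when `index_exists()` is false, so an existing index is never rebuilt.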

PE or SE

Enable the user to provide paired-end (PE) or single-end (SE) reads in units.tsv by simply including or omitting fq2.
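
One way the detection could work is sketched below. Only `fq2` comes from the issue; the `sample` and `fq1` column names, the `paired` key, and the example rows are assumptions for illustration.

```python
import csv
import io


def read_units(handle):
    """Parse a tab-separated units table and flag each row as PE or SE.

    A row counts as paired-end only if its fq2 field is present and
    non-empty; a missing fq2 column makes every row single-end.
    """
    units = []
    for row in csv.DictReader(handle, delimiter="\t"):
        fq2 = (row.get("fq2") or "").strip()
        row["paired"] = bool(fq2)
        units.append(row)
    return units


# Hypothetical table: sample A is paired-end, sample B is single-end.
example = (
    "sample\tfq1\tfq2\n"
    "A\tA_R1.fastq.gz\tA_R2.fastq.gz\n"
    "B\tB_R1.fastq.gz\t\n"
)
units = read_units(io.StringIO(example))
```

Downstream rules could then branch on `paired` when choosing inputs and command-line options.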

ensembl mart always human

We need to pass the species information from config.yaml into the .Rmd so that it downloads the correct mart for annotation (ens2gene, etc.).
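
As a rough sketch, the dataset name could be derived from the species string rather than hard-coded. This assumes the species is given as a binomial name and that the dataset follows the common Ensembl pattern of genus initial plus species name plus `_gene_ensembl`; both assumptions, and the function name, are illustrative and should be checked against the datasets the mart actually lists.

```python
def ensembl_dataset(species):
    """Guess an Ensembl dataset name, e.g. 'Homo sapiens' -> 'hsapiens_gene_ensembl'.

    Assumption: the name is the initials of all words but the last,
    followed by the last word and the suffix '_gene_ensembl'.
    """
    words = species.lower().replace("_", " ").split()
    if len(words) < 2:
        raise ValueError(f"expected a binomial species name, got {species!r}")
    prefix = "".join(word[0] for word in words[:-1])
    return f"{prefix}{words[-1]}_gene_ensembl"
```

The same string manipulation is easy to reproduce in R inside the .Rmd if the report is where the mart is selected.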

update DE test to glmTreat

The original workflow ran a DE test on all genes and then filtered for FDR < 0.05 and logFC > 1.5. This means the FDR correction takes all genes into account: even though we are only interested in logFC > 1.5, the FDR accounts for genes with logFC < 1.5, so we are unnecessarily penalized for tests of genes that we already assume have no biological relevance. glmTreat instead tests for differential expression relative to a fold-change threshold.
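
The difference between the two approaches can be written as a change of null hypothesis. The symbols are assumptions introduced here: $\beta_g$ is the log fold change of gene $g$ and $\tau$ is the chosen threshold.

```latex
\begin{align*}
\text{test, then filter:} \quad & H_0:\ \beta_g = 0
  \quad \text{vs.} \quad H_1:\ \beta_g \neq 0,
  \quad \text{then keep genes with } |\beta_g| > \tau \\
\text{threshold test:} \quad & H_0:\ |\beta_g| \le \tau
  \quad \text{vs.} \quad H_1:\ |\beta_g| > \tau
\end{align*}
```

In the second form the threshold is part of the test itself, so the reported FDR refers directly to the claim that a gene changes by more than $\tau$.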

Deprecate clusters.yaml

Specify memory and threads in the rules themselves. This is easier to read and makes it more convenient to add new rules.

Unnecessary file copying in `mergeLanesAndRename.R`

mergeLanesAndRename.R accounts for projects where samples were sequenced across lanes by running cat on them. However, the majority of our projects have one file per sample, so a simple rename using ln -s samplename.fastq.gz samplename-SE.fastq.gz would be enough to make the pipeline work, provided we first test whether there is only one file per sample.

We could also consider running cat at the trim_galore step: trim_galore <(cat lane1_R1 lane2_R1 lane3_R1) <(cat lane1_R2 lane2_R2 lane3_R2). This is a bigger change to the pipeline and may require more debugging, but it would do away with the file copying even when the data are split across lanes.

I can work on this the next time I use this pipeline, but other people can feel free to tackle this if you are interested.
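
The first suggestion (link when there is one file, concatenate otherwise) might look like the sketch below. It is written in Python for illustration rather than as a patch to the R script; the function name and return values are hypothetical.

```python
import os
import shutil
from pathlib import Path


def link_or_merge(lane_files, dest):
    """Symlink a single input file, or concatenate several into dest.

    Byte-wise concatenation matches what `cat` does and remains valid
    for gzip-compressed FASTQ files.
    """
    lane_files = [Path(f) for f in lane_files]
    dest = Path(dest)
    if not lane_files:
        raise ValueError(f"no input files for {dest}")
    if len(lane_files) == 1:
        # One file per sample: no copy needed, just link it.
        os.symlink(lane_files[0].resolve(), dest)
        return "linked"
    with open(dest, "wb") as out:
        for lane in lane_files:
            with open(lane, "rb") as src:
                shutil.copyfileobj(src, out)
    return "merged"
```

For example, `link_or_merge(["samplename.fastq.gz"], "samplename-SE.fastq.gz")` would create the symlink described above instead of a copy.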

Log the version of each program

Log the version of each program that is actually used in the workflow. This information is useful for analyst reference.

Possible documentation locations:

  • logs/
  • within the diffExp.Rmd/.html report.
  • elsewhere?
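
A minimal sketch of the logs/ option, assuming each program prints its version when called with a version flag. The function names and the tab-separated output format are illustrative assumptions.

```python
import subprocess
import sys


def program_version(command):
    """Run a version command and return the first line it prints."""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return "not found"
    # Some tools print their version to stderr, so check both streams.
    lines = (result.stdout + result.stderr).strip().splitlines()
    return lines[0] if lines else "unknown"


def write_versions(commands, log_path):
    """Write one 'name<TAB>version' line per program to log_path."""
    with open(log_path, "w") as log:
        for name, command in commands.items():
            log.write(f"{name}\t{program_version(command)}\n")
```

It could be called with something like `write_versions({"python": [sys.executable, "--version"], "STAR": ["STAR", "--version"]}, "logs/versions.tsv")`, where the log path is a hypothetical choice; the resulting file could also be read into the report.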
