uorf-tools's Introduction

uORF-Tools

uORF-Tools is a workflow and a collection of tools for the analysis of 'upstream Open Reading Frames' (uORFs for short). The workflow is based on the workflow management system snakemake and handles the installation of all dependencies as well as all processing steps. uORF-Tools is open source and available under the GPL-3 License. Installation is described below; for usage, please refer to the Userguide.

Installation via bioconda

uORF-Tools requires snakemake (version 5.4.5), which can be installed with all dependencies via conda. Once you have conda installed, simply type:

    $ conda create -c conda-forge -c bioconda -n snakemake snakemake==5.4.5

    $ source activate snakemake

Create a project directory and change into it:

     $ mkdir project

     $ cd project

Retrieve the uORF-Tools from GitHub:

     $ git clone https://github.com/Biochemistry1-FFM/uORF-Tools.git

Now you can get started. Usage of the workflow is described in the Userguide.

uorf-tools's People

Contributors

anibunny12, biochemistry1-ffm, eggzilla, rickgelhausen

uorf-tools's Issues

Error in rule sizeFactors: jobid: 21

I am facing the same problem with the C. elegans genome from Ensembl. Please help.

Error in rule sizeFactors:
jobid: 21
output: uORFs/sfactors_lprot.csv
log: logs/sizeFactors.log (check log file(s) for error message)
conda-env: /home/ngslab/ashish/project/.snakemake/conda/5bc873a8
shell:
mkdir -p uORFs; uORF-Tools/scripts/generate_size_factors.R -t uORF-Tools/samples.tsv -b maplink/ -a uORFs/longest_protein_coding_transcripts.gtf -s uORFs/sfactors_lprot.csv 2> logs/sizeFactors.log;

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /home/ngslab/ashish/project/.snakemake/log/2020-04-04T005245.080262.snakemake.log

I also added ##format: gtf to the annotation file, but it still does not work.

error in rule sizeFactors

Hello,

I'm quite excited to use uORF-Tools. It seems to run fine at the beginning but I'm getting the following error after about 8% of jobs are done:

`[Wed May 15 09:22:47 2019]
rule sizeFactors:
input: uORFs/longest_protein_coding_transcripts.gtf, maplink/RIBO-ISRIB-1.bam, maplink/RIBO-ISRIB-2.bam, maplink/RIBO-ISRIB-3.bam, maplink/RIBO-vehicle-1.bam, maplink/RIBO-vehicle-2.bam, maplink/RIBO-vehicle-3.bam
output: uORFs/sfactors_lprot.csv
jobid: 21

Activating conda environment: /Users/mo/project/.snakemake/conda/54de1460
Loading required package: methods
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

anyDuplicated, append, as.data.frame, basename, cbind, colMeans,
colnames, colSums, dirname, do.call, duplicated, eval, evalq,
Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply,
lengths, Map, mapply, match, mget, order, paste, pmax, pmax.int,
pmin, pmin.int, Position, rank, rbind, Reduce, rowMeans, rownames,
rowSums, sapply, setdiff, sort, table, tapply, union, unique,
unsplit, which, which.max, which.min

Loading required package: S4Vectors

Attaching package: ‘S4Vectors’

The following object is masked from ‘package:base’:

expand.grid

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: SummarizedExperiment
Loading required package: Biobase
Welcome to Bioconductor

Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: DelayedArray
Loading required package: matrixStats

Attaching package: ‘matrixStats’

The following objects are masked from ‘package:Biobase’:

anyMissing, rowMedians

Loading required package: BiocParallel

Attaching package: ‘DelayedArray’

The following objects are masked from ‘package:matrixStats’:

colMaxs, colMins, colRanges, rowMaxs, rowMins, rowRanges

The following objects are masked from ‘package:base’:

aperm, apply

Loading required package: Biostrings
Loading required package: XVector

Attaching package: ‘Biostrings’

The following object is masked from ‘package:DelayedArray’:

type

The following object is masked from ‘package:base’:

strsplit

Loading required package: Rsamtools

Attaching package: ‘plyr’

The following object is masked from ‘package:XVector’:

compact

The following object is masked from ‘package:matrixStats’:

count

The following object is masked from ‘package:IRanges’:

desc

The following object is masked from ‘package:S4Vectors’:

rename

Error in while (grepl("^#", line)) { : argument is of length zero
Calls: import.gff ... import -> import -> import -> .local -> .sniffGFFVersion
Execution halted
[Wed May 15 09:23:03 2019]
Error in rule sizeFactors:
jobid: 21
output: uORFs/sfactors_lprot.csv
conda-env: /Users/mo/project/.snakemake/conda/54de1460
shell:
mkdir -p uORFs; uORF-Tools/scripts/generate_size_factors.R -t uORF-Tools/samples.tsv -b maplink/ -a uORFs/longest_protein_coding_transcripts.gtf -s uORFs/sfactors_lprot.csv;
`

I can't figure out why the argument is of length zero in the grepl call. What am I missing? I'm working with the Mus musculus GRCm38.92 Ensembl genome and the corresponding GTF file.

Morgane
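The traceback ends in .sniffGFFVersion, where grepl("^#", line) is evaluated on the first lines of the annotation file. In R, that condition has length zero when there is no line left to read, so the error would be consistent with uORFs/longest_protein_coding_transcripts.gtf being empty or containing only comment lines. This is an assumption, not something confirmed in the thread. A minimal Python sketch for counting the annotation records in that file follows; the function name count_gtf_records is hypothetical and not part of uORF-Tools.

```python
from pathlib import Path


def count_gtf_records(path):
    """Return the number of non-empty, non-comment lines in a GTF/GFF file."""
    records = 0
    with Path(path).open() as handle:
        for line in handle:
            stripped = line.strip()
            # Header and comment lines start with '#'; blank lines carry no record.
            if stripped and not stripped.startswith("#"):
                records += 1
    return records
```

Calling count_gtf_records("uORFs/longest_protein_coding_transcripts.gtf") from the project directory and getting 0 would point to the step that writes this file rather than to sizeFactors itself.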

MissingInputException

Hello,
I would like to run uORF-Tools on some maize ribo-seq data. I am a beginner when it comes to coding. I am running it on an LSF cluster, so I modified the scripts accordingly to the best of my ability. The error I get when I run the script is:

Building DAG of jobs...
MissingInputException in line 4 of /group/lorri/uorfs/uORFtools/rules/bootstrap.smk:
Missing input files for rule rawbamlink:
SRR18584242.unique.bam

Based on the error message, I assume the mistake is somewhere in the first few lines of the bootstrap.smk file. Here are the first few lines of that file:

def getbam(wildcards):
    return samples.loc[(wildcards.method, wildcards.condition, wildcards.replicate), ["/group/lorri/uorfs/uORFtools/inputFile"]].dropna()

rule rawbamlink:
    input:
        bams=getbam
    output:
        "/group/lorri/uorfs/uORFtools/maplink/{method}-{condition}-{replicate}.bam"
    params:
        inlink=lambda wildcards, input:(os.getcwd() + "/" + (os.path.splitext(input.bams[0])[0]) + ".bam"),
        outlink=lambda wildcards, output:(os.getcwd() + "/" + str(output))
    threads: 1
    shell:
        "mkdir -p /group/lorri/uorfs/uORFtools/maplink; ln -s {params.inlink} {params.outlink}"

Thanks for reading.
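The missing file is reported as a bare name (SRR18584242.unique.bam), which suggests that Snakemake looked for it relative to its working directory. A quick way to narrow this down is to check every path in the sample sheet before starting the workflow. The sketch below assumes a tab-separated samples.tsv whose file column is called inputFile; that column name and the helper missing_inputs are assumptions for illustration. Note also that the column label passed to samples.loc in getbam has to match a header in the sample sheet exactly.

```python
import csv
from pathlib import Path


def missing_inputs(samples_tsv, column="inputFile", base_dir="."):
    """List sample-sheet entries whose file does not exist relative to base_dir."""
    missing = []
    with open(samples_tsv, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            entry = (row.get(column) or "").strip()
            if not entry:
                continue
            # Relative entries are resolved against base_dir (the directory the
            # workflow runs in); absolute entries are checked as they are.
            if not (Path(base_dir) / entry).exists():
                missing.append(entry)
    return missing
```

Running missing_inputs("samples.tsv", base_dir=...) with the directory passed to snakemake via --directory lists the entries that would trigger a MissingInputException; an empty list means all listed files were found.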

module 'collections' has no attribute 'Iterable'

Hi, I am very excited about using uORF-Tools along with snakemake and conda. I followed the installation instructions in ReadMe.md and the User Guide. While running

% snakemake --use-conda -s uORF-Tools/Snakefile --configfile uORF-Tools/config.yaml --directory ${PWD} -j 20 --latency-wait 60

I've encountered the error message:

AttributeError in line 33 of /Users/cwon2/CompBio/uORFs-project/uORF-Tools/Snakefile:
module 'collections' has no attribute 'Iterable'
  File "/Users/cwon2/CompBio/uORFs-project/uORF-Tools/Snakefile", line 33, in <module>

I noticed that the current conda (4.10) ships the latest version of Python (3.10). Because the alias collections.Iterable was removed in Python 3.10 (the class is still available as collections.abc.Iterable), creating the snakemake 5.4.5 environment with that Python version produces the error above. If snakemake 5.4.5 is the only version you recommend for running uORF-Tools, perhaps mention this in ReadMe.md and suggest creating the snakemake environment with Python 3.7: conda create -c conda-forge -c bioconda -n snakemake snakemake==5.4.5 python==3.7.11

Thanks,
Chao-Jen
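The difference between the two Python versions can be illustrated with a small sketch. It shows only the standard-library change behind the error message and does not patch snakemake or any of its dependencies; the names is_iterable and has_legacy_alias are hypothetical.

```python
import collections.abc
import sys


def is_iterable(obj):
    """Check iterability through collections.abc, the supported location."""
    return isinstance(obj, collections.abc.Iterable)


# The old alias collections.Iterable exists only on Python versions before 3.10.
# Code that still refers to it raises the AttributeError quoted above.
has_legacy_alias = hasattr(collections, "Iterable")
print(sys.version_info[:2], "legacy alias available:", has_legacy_alias)
```

Inside an environment created with python==3.7.11 the alias is still available, which is why pinning the Python version avoids the error.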
