The orfmine from radusuciu

Overview

Recent studies attribute a new role to the noncoding genome in the production of novel peptides. The widespread transcription of noncoding regions and the pervasive translation of the resulting RNAs offer organisms a vast reservoir of novel peptides.

ORFmine is an open-source package that extracts, annotates, and characterizes the fold potential and the structural properties of all Open Reading Frames (ORFs) encoded in a genome (including coding and noncoding sequences). ORFmine consists of two programs, ORFtrack and ORFold, that can be used together or independently (see here for an example of application).

Both tools are written in Python 3 (version >= 3.6). The install.sh script installs both ORFtrack and ORFold along with their dependencies.

Documentation

The complete documentation is available at https://i2bc.github.io/ORFmine/

Installation

1. Download and uncompress the latest release

You can clone the whole project with the following command:

git clone https://github.com/i2bc/ORFmine.git

Alternatively, you can click here to access the latest release.

Then uncompress the archive. If you downloaded:

the .zip file: unzip ORFmine-x.x.x.zip
the .tar.gz file: tar xzvf ORFmine-x.x.x.tar.gz

2. Create an isolated environment

Although not strictly necessary, this step is highly recommended (it will allow you to work on different projects without having any conflicting library versions). If you do not want to create a virtual environment, please go directly to the install section.

Install virtualenv

python3 -m pip install virtualenv

Create a virtual environment

virtualenv -p python3 orfmine_env

Activate the created environment

source orfmine_env/bin/activate

Once the environment is activated, any Python library you install with pip is installed only in this isolated environment. You must activate the environment every time you need the libraries installed in it.

Once you are done working on your project, simply type deactivate to exit the environment.
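If you are unsure whether the environment is currently active, a small Python check can help. The sketch below is only an illustration and is not part of ORFmine; the function name in_virtual_environment is hypothetical. It relies on the standard sys.prefix and sys.base_prefix attributes, and on the sys.real_prefix attribute set by older virtualenv releases.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a virtualenv/venv.

    Hypothetical helper, not part of ORFmine.
    """
    # Older virtualenv releases expose sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # Recent virtualenv releases and the stdlib venv module make
    # sys.prefix differ from sys.base_prefix.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active: {}".format(in_virtual_environment()))
    print("interpreter prefix: {}".format(sys.prefix))
```

When run with the environment activated, the reported prefix should point inside the orfmine_env directory.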

Note

To permanently delete your virtual environment, simply remove its directory with the following command: rm -r orfmine_env/

Note

Note that some external packages used in ORFmine (such as Biopython) require Python version >= 3.6. Before creating your virtual environment, make sure that your Python version is up to date.
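The version requirement can be checked before creating the environment. The following is a minimal sketch, not part of ORFmine: the names MINIMUM_VERSION and python_is_recent_enough are hypothetical, and the only fact taken from this document is the >= 3.6 threshold.

```python
import sys

# Minimum version stated in this document.
MINIMUM_VERSION = (3, 6)


def python_is_recent_enough(version_info=sys.version_info, minimum=MINIMUM_VERSION):
    """Return True when (major, minor) is at least the required minimum.

    Hypothetical helper, not part of ORFmine.
    """
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = "{}.{}".format(sys.version_info[0], sys.version_info[1])
    if python_is_recent_enough():
        print("Python {} satisfies the >= 3.6 requirement".format(current))
    else:
        print("Python {} is too old; version >= 3.6 is required".format(current))
```

The sketch deliberately uses str.format rather than f-strings, so that it still runs and reports the problem on interpreters older than 3.6.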

3. Install ORFmine

Preparation before the Installation

Please note that we refer below to the root directory of ORFmine as ORFmine-x.x.x, where x.x.x is the version downloaded from an archive file (either .zip or .tar.gz). If you cloned the project instead, the root directory is ORFmine.

If you only want to use ORFtrack to annotate all the possible ORFs of a genome, there are no other dependencies to install, and you can go directly to the Installation step below.

Installing ORFold is a bit more demanding, because some external tools must be downloaded and/or installed before launching the installation.

First, ORFold relies on the HCA method to calculate the fold potential. As a result, pyHCA [1] must be installed on your machine before installing ORFold. pyHCA is free to download and can be installed by following its developers' instructions.
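Before installing ORFold, you may want to confirm that pyHCA is visible to the Python interpreter you will use. The sketch below is an illustration only: the helper module_available is hypothetical, and the import name pyHCA is an assumption based on the package name, so adjust it if your installation exposes a different module name.

```python
import importlib.util


def module_available(name):
    """Return True when a top-level module can be found, without importing it.

    Hypothetical helper, not part of ORFmine.
    """
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # "pyHCA" is an assumed import name, taken from the package name.
    if module_available("pyHCA"):
        print("pyHCA was found in this environment")
    else:
        print("pyHCA was not found; install it before installing ORFold")
```

Run the check with the same interpreter (and the same virtual environment, if any) that you will use for the installation.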

If you do not need ORFold to calculate the disorder and/or aggregation propensities and you have already installed pyHCA, you can go directly to the Installation step below.

However, if you want to use IUPred [2][3][4] and/or Tango [5][6][7] with ORFold, you must first contact their developers through the respective links to obtain the programs. Neither program is freely available to non-academic users.

Once you have access to IUPred and Tango, place them in a directory called softwares located at ORFmine-x.x.x/orfold_v1/orfold/. To do so:

First create the softwares directory if not already created:

mkdir ORFmine-x.x.x/orfold_v1/orfold/softwares

Move the IUPred source code and data (provided by the developer):

  mv iupred2a.py ORFmine-x.x.x/orfold_v1/orfold/softwares
  mv data ORFmine-x.x.x/orfold_v1/orfold/softwares

Move the Tango program:

For macOS:

  mv tango2_3_1 ORFmine-x.x.x/orfold_v1/orfold/softwares

For Linux:

  mv tango_x86_64_release ORFmine-x.x.x/orfold_v1/orfold/softwares

For Windows:

  mv Tango.exe ORFmine-x.x.x/orfold_v1/orfold/softwares

Note

The calculations of the disorder and aggregation propensities are both optional and complementary to the HCA score. As a result, IUPred and Tango are not required to install ORFold, and neither depends on the other. ORFold will install properly without them, or with only one of them.
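To see which optional tools are in place before launching the installation, the softwares directory can be inspected with a few lines of Python. This is a rough sketch rather than anything ORFmine provides: the function optional_tools_present is hypothetical, and the file names it looks for are simply those used in the mv commands above.

```python
from pathlib import Path

# Entries named in the mv commands of this document.
IUPRED_ENTRIES = ("iupred2a.py", "data")
TANGO_ENTRIES = ("tango2_3_1", "tango_x86_64_release", "Tango.exe")


def optional_tools_present(softwares_dir):
    """Report which optional tools appear in the softwares directory.

    Hypothetical helper, not part of ORFmine.
    """
    root = Path(softwares_dir)
    return {
        # IUPred needs both its script and its data directory.
        "IUPred": all((root / entry).exists() for entry in IUPRED_ENTRIES),
        # Tango ships one platform-specific executable; any one is enough.
        "Tango": any((root / entry).exists() for entry in TANGO_ENTRIES),
    }


if __name__ == "__main__":
    # Replace x.x.x with your version, or use ORFmine/ for a cloned project.
    status = optional_tools_present("ORFmine-x.x.x/orfold_v1/orfold/softwares")
    for tool, found in status.items():
        print("{}: {}".format(tool, "found" if found else "not found (optional)"))
```

A missing tool is not an error here, since both are optional and independent of each other.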

Installation

If you use a virtual environment, be sure that your virtual environment is activated. Then, in any case, follow the procedure described below:

cd ORFmine-x.x.x
chmod u+x install.sh
./install.sh

This script first uninstalls ORFmine if it is already installed, then reinstalls it. It also installs all the dependencies needed by ORFtrack and ORFold.

References

[1] Bitard-Feildel, T. & Callebaut, I. HCAtk and pyHCA: A Toolkit and Python API for the Hydrophobic Cluster Analysis of Protein Sequences. bioRxiv 249995 (2018).
[2] Dosztányi, Z., Csizmok, V., Tompa, P. & Simon, I. The pairwise energy content estimated from amino acid composition discriminates between folded and intrinsically unstructured proteins. Journal of Molecular Biology 347, 827–839 (2005).
[3] Dosztányi, Z. Prediction of protein disorder based on IUPred. Protein Science 27, 331–340 (2018).
[4] Mészáros, B., Erdős, G. & Dosztányi, Z. IUPred2A: context-dependent prediction of protein disorder as a function of redox state and protein binding. Nucleic Acids Research 46, W329–W337 (2018).
[5] Fernandez-Escamilla, A.-M., Rousseau, F., Schymkowitz, J. & Serrano, L. Prediction of sequence-dependent and mutational effects on the aggregation of peptides and proteins. Nature Biotechnology 22, 1302–1306 (2004).
[6] Linding, R., Schymkowitz, J., Rousseau, F., Diella, F. & Serrano, L. A comparative study of the relationship between protein structure and β-aggregation in globular and intrinsically disordered proteins. Journal of Molecular Biology 342, 345–353 (2004).
[7] Rousseau, F., Schymkowitz, J. & Serrano, L. Protein aggregation and amyloidosis: confusion of the kinds? Current Opinion in Structural Biology 16, 118–126 (2006).
