Varia

Varia is a tool to predict var genes based on short 150-200 base pair sequences (like PCR fragments). It is composed of two modules, Varia VIP and Varia GEM.

To install Varia, first download the current version, e.g.:

  1. git clone https://github.com/GCJMackenzie/Varia.git
  2. Move to the directory: cd Varia/Varia1_6
  3. Next, download two files with var gene data. You can provide your own instead (see the manual), but by default download:
    3a: Download vardb_domains.txt.gz from https://github.com/ThomasDOtto/varDB/tree/master/Datasets/Varia/ into the directory domains and unzip it.
    Then run the following to generate a different version of the domains:
    cat vardb_domains.txt | perl -e 'while(<STDIN>){@ar=split(/\t/); chomp($ar[3]); $h{$ar[0]}.=$ar[3]."-"}; foreach $k (keys %h){ print "$k\t$h{$k}\n"}' > vardb_GEM_domains.txt
    3b: Download mega_var.fasta.gz from https://github.com/ThomasDOtto/varDB/tree/master/Datasets/Varia/ into the directory vardb and unzip it.
  4. Change the attributes of the executable files: chmod 755 *.sh
  5. Run the installation script: ./Install_Varia.sh. This will install all the needed packages.
  6. Set the path as suggested in the last line of the Varia installation script: PATH=$PATH:<...Varia/Varia1_6>; export PATH
  7. Finally install vsearch:
    conda install -c bioconda vsearch
    conda install -c bioconda/label/cf201901 vsearch
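
The Perl one-liner in step 3a collapses the domain table into one line per gene. The sketch below re-expresses that idea in awk so the transformation is easier to inspect; it is not part of Varia. It assumes, as the one-liner implies, that column 1 holds the gene ID and column 4 the domain name. The function name collapse_domains and the toy gene IDs are invented for illustration. Unlike the Perl version, which prints genes in hash order, this one keeps first-seen order.

```shell
#!/usr/bin/env bash
# Collapse a tab-separated domain table into one line per gene.
# Assumption: column 1 is the gene ID and column 4 the domain name,
# mirroring the fields used by the Perl one-liner in step 3a.
collapse_domains() {
  awk -F'\t' '
    !($1 in doms) { order[++n] = $1 }     # remember first-seen order of gene IDs
    { doms[$1] = doms[$1] $4 "-" }        # append the domain plus a trailing "-"
    END { for (i = 1; i <= n; i++) print order[i] "\t" doms[order[i]] }
  '
}

# Toy input (invented IDs and coordinates, for illustration only).
printf 'geneA\t1\t100\tNTS\ngeneA\t101\t400\tDBLa\ngeneB\t1\t90\tNTS\n' | collapse_domains
```

On the toy input this prints geneA followed by NTS-DBLa- and geneB followed by NTS-, each separated by a tab, which is the layout the one-liner writes to vardb_GEM_domains.txt.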

Running Varia.sh VIP -h should then print information on how to run the first module.

We tested Varia in Linux and Mac (10.13) environments.

#Pre-requisites

Varia is run in a Linux environment. To run module 1, Varia requires the following tools to be installed and included in the user’s path (the installation script will try to install some of them):
-mcl v12-135: https://micans.org/mcl/
-megablast + formatdb v2.2.26: https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=Download
-samtools v1.7: http://samtools.sourceforge.net/
-Vsearch 2.14.2
-circos v 0.69-6, perl v 5.022000: http://circos.ca/software/download/circos/

The script Install_Varia.sh has been included to help check that the required tools are installed. Varia has two pipelines: (1) the var identification and prediction module, Varia_VIP, and (2) the var gene expression analysis module, Varia_GEM.

#Run the script

Arguments
Varia_VIP is run using the following command line:

Varia.sh [optional arguments] -i [input tag file]

-i is the only mandatory argument for Varia_VIP; it specifies the input file to be used. Varia_VIP also has a number of optional arguments, which can be used to change the output directory and the various filters used throughout the module. A detailed list of these options and their default settings can be found in the readme file, or by using:
Varia.sh -h
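
As a minimal sketch of a guarded invocation, the wrapper below checks that the input file exists and is non-empty before passing it to -i. It follows the command form shown above; if your version expects the module name first, as in Varia.sh VIP -h, adjust the call accordingly. The function name run_varia_vip, the VARIA_CMD override, and the file name tags.fasta are all hypothetical and not part of Varia.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: verify the input file before handing it to Varia.sh -i.
# VARIA_CMD is an assumption added here so the command can be swapped out.
run_varia_vip() {
  local input=$1
  if [ ! -s "$input" ]; then
    echo "Input file '$input' is missing or empty." >&2
    return 1
  fi
  "${VARIA_CMD:-Varia.sh}" -i "$input"
}

# Usage (tags.fasta is a placeholder file name):
#   run_varia_vip tags.fasta
```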

Databases

Varia builds on existing var gene databases, which can be found at:
ftp://ftp.sanger.ac.uk/pub/project/pathogens/Plasmodium/falciparum/PF3K/varDB/FullDataset/
and
https://github.com/ThomasDOtto/varDB
