Mitofree

THIS REPO HAS BEEN ARCHIVED

You can use the Zenodo DOI to cite this code:

DOI

Docker image available here


A pipeline for automated mitochondrial genome assembly using public data.

Dependencies:

All of the above dependencies can be installed through Bioconda. However, because of dependency conflicts (between Python 2 and 3, for instance), manually creating a conda environment for Mitofree can be tricky, so we encourage using the docker image to run this software. This README has been written to be as accessible as possible, but if you are not familiar with docker, we highly recommend learning the basics first.

Running Mitofree with docker:

1 - Install docker

2 - Download the latest image:

docker pull gavieira/mitofree:latest

3 - Use the following command to generate a container:

docker run --name mitofree -i -t -v ~:/mnt -w /mnt gavieira/mitofree /bin/bash

Note: The container has now been created, so the next time you need to run Mitofree you can skip the previous steps: start the container and go straight to step 4. To start the container, run:

docker start -i mitofree

4 - Finally, run Mitofree:

Basic usage:

nohup mitofree.py dataset_list.txt >mitofree.out 2>mitofree.err &

Mitofree's help message:

usage: mitofree.py [-h] [-S] [-M] [--novop_kmer] [--mitob_kmer] [-g] [-s] [-T]
                   FILENAME

Downloads sra NGS data and assembles mitochondrial contigs using NOVOPlasty
and MITObim

positional arguments:
  FILENAME           Path to file with multiple accessions (one per line)

optional arguments:
  -h, --help         show this help message and exit
  -S, --savespace    Automatically removes residual assembly files such as
                     fastq and mitobim iterations
  -M , --maxmemory   Limit of RAM usage for NOVOPlasty. Default: no limit
  --novop_kmer       K-mer used in NOVOPlasty assembly. Default: 39
  --mitob_kmer       K-mer used in MITObim assembly. Default: 73
  -g , --gencode     Genetic code table. Default: 2 (Vertebrate Mitochondrial)
  -s , --subset      Max number of reads used in the assembly process.
                     Default: 50 million reads
  -T , --timeout     Custom timeout for MITObim, in hours. Default: 24h

Please note the -M "--maxmemory" argument, which limits NOVOPlasty's RAM usage (in GB). If you are running this software on a machine with limited RAM, set this option so that it does not use all of your memory. For instance, on a computer with 8 GB of RAM, you may want to use "-M 7".
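The "-M 7" rule of thumb for an 8 GB machine can be sketched as a small Python helper. This is only an illustration and is not part of Mitofree: the function names `suggest_maxmemory` and `build_command`, and the 1 GB reserve, are assumptions made for this example.

```python
def suggest_maxmemory(total_ram_gb, reserve_gb=1):
    """Return a value for -M that leaves `reserve_gb` GB for the rest of the system.

    The 1 GB default reserve is an assumption that mirrors the "-M 7" example
    for an 8 GB machine; adjust it to your own workload.
    """
    limit = int(total_ram_gb) - reserve_gb
    if limit < 1:
        raise ValueError("not enough RAM to leave the requested reserve")
    return limit


def build_command(dataset_list, total_ram_gb):
    """Build the argument list for a mitofree.py run with a RAM cap (hypothetical helper)."""
    return ["mitofree.py", "-M", str(suggest_maxmemory(total_ram_gb)), dataset_list]


command = build_command("dataset_list.txt", total_ram_gb=8)
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run` inside the container, or simply used as a reminder of where the option goes on the command line.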

The -s "--subset" argument can be used to limit dataset size, which can also reduce RAM requirements. It can also be used to increase dataset size, which may be useful if you are having trouble circularizing a mitogenome and have some RAM to spare.

Example of Mitofree's input file:

Basically, this file consists of three tab-separated columns, each holding a specific piece of information:

1-SRA_RUN_NUMBER 2-SPECIES_NAME 3-SEED_GENBANK_ACCESSION

For instance:

ERR1306022	Species1	MK297287
ERR7295165	Species2	MK297241
ERR1306034	Species3	MK291745
#SRR4409513	Species4	MK291678 #This assembly will be skipped

Each line corresponds to a different assembly. This way, you can build a list of as many organisms as you want and assemble their mitogenomes all at once. It is also possible to skip an assembly by adding a hash symbol (#) at the start of its corresponding line.
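To check an input file before starting a long run, the format described above can be read with a few lines of Python. This is a minimal sketch and not Mitofree's own parser: the function name `read_dataset_list` is hypothetical, and treating blank lines as skippable is an assumption made here.

```python
import io


def read_dataset_list(handle):
    """Yield (sra_run, species, seed_accession) tuples from a Mitofree-style list."""
    for line_number, raw in enumerate(handle, start=1):
        line = raw.rstrip("\n")
        # Lines starting with '#' mark skipped assemblies; blank lines are
        # also ignored here (an assumption, not documented behaviour).
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(
                f"line {line_number}: expected 3 tab-separated columns, got {len(fields)}"
            )
        yield tuple(field.strip() for field in fields)


# The example file from this README, held in memory instead of on disk.
example = io.StringIO(
    "ERR1306022\tSpecies1\tMK297287\n"
    "ERR7295165\tSpecies2\tMK297241\n"
    "ERR1306034\tSpecies3\tMK291745\n"
    "#SRR4409513\tSpecies4\tMK291678 #This assembly will be skipped\n"
)
records = list(read_dataset_list(example))
for record in records:
    print(record)
```

For a real file, replace the `io.StringIO` object with `open("dataset_list.txt")`. A line that uses spaces instead of tabs raises a `ValueError` with its line number, which is a common cause of malformed input.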
