asqcan

A combined pipeline for bacterial genome assembly, quality control and annotation

asqcan is a workflow pipeline for the automated assembly, quality control and annotation of bacterial genome sequences. Modern bacterial sequencing projects can involve a large number of isolates, and assembling each one, running the necessary QC and annotating it can be time-consuming. The asqcan pipeline seeks to automate as much of this as possible. The current steps asqcan takes are:

  1. Quality analysis of raw reads with FastQC
  2. Genome assembly with SPAdes
  3. Quality analysis of assemblies with QUAST
  4. Contamination and quality analysis of assemblies with BlobTools
  5. Annotation of assemblies using Prokka

The asqcan pipeline runs these five steps on each .fastq or .fastq.gz reads file in the directory provided by the -q option. When asqcan completes, it generates a report (asqcan_report.tsv) on the success or failure of each step of the pipeline. Successful steps are not rerun: on subsequent executions asqcan detects them and skips them.
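The file discovery and skip-on-success behaviour described above can be illustrated with a short Python sketch. This is not asqcan's own code: the step labels and the function names `find_read_files`, `sample_name` and `pending_steps` are hypothetical, and how asqcan actually records a successful step is not shown here.

```python
from pathlib import Path

# Labels chosen for this sketch; they mirror the five steps listed above.
STEPS = ["fastqc", "spades", "quast", "blobtools", "prokka"]
READ_SUFFIXES = (".fastq.gz", ".fastq")


def find_read_files(reads_dir):
    """Return the .fastq and .fastq.gz files in reads_dir, sorted by name."""
    return sorted(
        p for p in Path(reads_dir).iterdir()
        if p.is_file() and p.name.endswith(READ_SUFFIXES)
    )


def sample_name(path):
    """Strip the reads suffix, e.g. 'isolate1.fastq.gz' -> 'isolate1'."""
    name = Path(path).name
    for suffix in READ_SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name


def pending_steps(completed):
    """Return the steps still to run, in pipeline order."""
    return [step for step in STEPS if step not in completed]
```

Given a set of steps already marked successful for a sample, `pending_steps` returns only the remaining ones, which is the idea behind not rerunning completed work.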

Requirements

asqcan requires a Linux-based system and the tools it runs in the steps above: FastQC, SPAdes, QUAST, BlobTools and Prokka.

Installation

To download and install asqcan with all dependencies, use conda:

conda install -c conda-forge -c bioconda asqcan

or pip (requires manual dependency installation):

pip install git+https://github.com/bogemad/asqcan.git

or a manual install (again, this requires you to install the dependencies yourself):

git clone https://github.com/bogemad/asqcan.git
cd asqcan
python setup.py install
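Because the pip and manual routes leave dependency installation to you, it can help to confirm that the external tools are on your PATH before a first run. The sketch below is one way to do that in Python. The executable names in `ASSUMED_EXECUTABLES` are assumptions about how these tools are commonly installed, not names taken from asqcan, so adjust them to your setup.

```python
import shutil

# Executable names are assumptions; adjust them to match your installation.
ASSUMED_EXECUTABLES = ["fastqc", "spades.py", "quast.py", "blobtools", "prokka"]


def missing_tools(executables, which=shutil.which):
    """Return the executables that cannot be found on PATH."""
    return [exe for exe in executables if which(exe) is None]


if __name__ == "__main__":
    missing = missing_tools(ASSUMED_EXECUTABLES)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All assumed executables were found.")
```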

Usage

usage: asqcan [-h] -q READS_DIR -o OUTDIR [-b DB] [-t THREADS] [-f]
              [--version] [-v]

asqcan - A combined pipeline for bacterial genome ASsembly, Quality Control,
and ANnotation.

required arguments:
  -q READS_DIR, --fastq-dir READS_DIR
                        Path to a directory with your interleaved fastq files.
  -o OUTDIR, --output-directory OUTDIR
                        Path to the output directory. A directory will be
                        created if one does not exist.

optional arguments:
  -h, --help            show this help message and exit
  -b DB, --blast_database DB
                        Path to the local nt blast database. This pipeline
                        does not require you to have a local copy of the nt
                        database but without it you will not be able to use
                        similarity data for blobtools. Similarity data adds
                        significantly to the blobplot and blobtools table
                        outputs of this pipeline. See https://blast.ncbi.nlm.n
                        ih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=
                        Download to install a local nt database.
  -t THREADS, --threads THREADS
                        Number of threads to use for multiprocessing.
  -f, --force           Overwrite files in the output directories.
  --version             show program's version number and exit
  -v, --verbose         Increase verbosity on command line output (n.b.
                        verbose output is always saved to asqcan.log in the
                        output directory).
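
When asqcan is driven from a script, the command line can be assembled from the options listed above. The following is a minimal sketch that uses only flags from the usage text; the helper names `build_asqcan_command` and `run_asqcan` are hypothetical, and it assumes `asqcan` is on your PATH.

```python
import subprocess


def build_asqcan_command(reads_dir, outdir, db=None, threads=None,
                         force=False, verbose=False):
    """Build an asqcan argument list using only flags from the usage text."""
    cmd = ["asqcan", "-q", str(reads_dir), "-o", str(outdir)]
    if db is not None:
        cmd += ["-b", str(db)]
    if threads is not None:
        cmd += ["-t", str(threads)]
    if force:
        cmd.append("-f")
    if verbose:
        cmd.append("-v")
    return cmd


def run_asqcan(*args, **kwargs):
    """Run asqcan and raise CalledProcessError on a non-zero exit status."""
    return subprocess.run(build_asqcan_command(*args, **kwargs), check=True)
```

For example, `run_asqcan("reads", "asqcan_out", threads=8)` corresponds to running `asqcan -q reads -o asqcan_out -t 8`, where `reads` and `asqcan_out` are placeholder paths.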
