

Annotated Genome Aligner (AGA)

AGA computes the optimal pairwise alignment of a nucleic acid sequence against a reference genome. It takes all CDS annotations of the reference into account to simultaneously score nucleic acid and amino acid similarity, minimize frameshifts, and avoid codon-misaligned gaps.

It outputs the optimal nucleic acid sequence alignment and all CDS and protein amino acid sequence alignments.

Web version

A web version is available at http://www.genomedetective.com/app/aga

Build

To compile AGA from source, you need a standards-compliant C++11 compiler and CMake.

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ../
make

Usage

As input, AGA requires a reference genome and a query sequence.

The reference genome is provided as a GenBank flat file record. For example, go to https://www.ncbi.nlm.nih.gov/nuccore/NC_001802.1 and choose Send to: > File > Format: GenBank.

The query sequence is provided in a FASTA file.

  build/src/aga {OPTIONS} [REFERENCE.GB] [QUERY.FASTA] [ALIGNMENT.FASTA]

    This is AGA, an Annotated Genome Aligner, (c) Emweb bvba
    See http://github.com/emweb/aga/LICENSE.txt for terms of use.

  OPTIONS:

      --help                            Display this help menu
      --version                         Display the version
      Alignment mode, specify one of:
        --global                          Global alignment
        --local                           Local alignment
      Nucleic Acid Score options
        --nt-weight=[WEIGHT]              Weight for NT score fraction
                                          (default=1)
        --nt-gap-open=[COST]              Nucleotide Gap Open penalty
                                          (default=-10)
        --nt-gap-extend=[COST]            Nucleotide Gap Extension penalty
                                          (default=-1)
        --nt-match=[SCORE]                Score for a nucleotide match
                                          (default=2)
        --nt-mismatch=[COST]              Penalty for a nucleotide mismatch
                                          (default=-2)
      Amino Acid Score options
        --aa-weight=[WEIGHT]              Total weight for AA score fraction
                                          (default=1)
        --aa-gap-open=[COST]              Amino Acid Gap Open penalty
                                          (default=-6)
        --aa-gap-extend=[COST]            Amino Acid Gap Extension penalty
                                          (default=-2)
        --aa-matrix=[MATRIX]              Substitution matrix for amino acid
                                          matches: BLOSUM62 or BLOSUM30
                                          (default=BLOSUM30)
        --aa-frameshift=[COST]            Frameshift penalty (default=-100)
        --aa-misalign=[COST]              Codon misalignment penalty
                                          (default=-20)
      General alignment options
        --strict-codon-boundaries         Do not optimize at codon boundaries
      Amino acid alignments output
        --cds-aa-alignments=[ALIGNMENT.FASTA]
                                          Amino acid alignments output file of
                                          CDS (FASTA)
        --cds-nt-alignments=[ALIGNMENT.FASTA]
                                          Nucleic acid CDS alignments output
                                          file of CDS (FASTA)
        --protein-aa-alignments=[ALIGNMENT.FASTA]
                                          Amino acid alignments output file of
                                          Protein Products (FASTA)
        --protein-nt-alignments=[ALIGNMENT.FASTA]
                                          Nucleic acid CDS alignments output
                                          file of Protein Products (FASTA)
      REFERENCE.GB                      Annotated reference (Genbank Record)
      QUERY.FASTA                       FASTA file with nucleic acid query
                                        sequence
      ALIGNMENT.FASTA                   Nucleic acid alignment output file
                                        (FASTA)
      "--" can be used to terminate flag options and force all following
      arguments to be treated as positional options

    AGA will compute the optimal pairwise alignment of a nucleic acid query
    sequence (QUERY.FASTA) against a reference genome (REFERENCE.GB), taking
    into account CDS annotations in the genbank record to include in the
    alignment score all amino acid alignments and minimizing frameshifts within
    these open reading frames. It writes the resulting alignment to
    ALIGNMENT.FASTA
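
The scoring options listed above can be combined on one command line. The following is a minimal sketch, not a recommended configuration: it switches to local alignment, selects BLOSUM62 instead of the default BLOSUM30, and doubles the default frameshift penalty. The penalty value and the file names are illustrative assumptions; only the flag names come from the help text. The script prints the command as a dry run when the binary or the input files are missing.

```shell
#!/bin/sh
# Paths are assumptions; override them through the environment if needed.
AGA="${AGA:-build/src/aga}"        # binary location as shown in the usage line
REF="${REF:-NC_001802.gb}"         # GenBank reference record
QUERY="${QUERY:-query.fasta}"      # nucleic acid query sequence
ALN="${ALN:-alignment.fasta}"      # nucleic acid alignment output

# Build the argument list. The -200 value is an arbitrary illustration
# of a stricter frameshift penalty than the default of -100.
set -- "$AGA" --local \
  --aa-matrix=BLOSUM62 \
  --aa-frameshift=-200 \
  "$REF" "$QUERY" "$ALN"

if [ -x "$AGA" ] && [ -f "$REF" ] && [ -f "$QUERY" ]; then
  "$@"                             # run the alignment
else
  echo "dry run: $*"               # show what would be executed
fi
```

Keeping the paths in variables makes it easy to rerun the same scoring settings on other reference and query pairs.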

Example

aga --global NC_001802.gb query.fasta alignment.fasta
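
To also write the per-CDS and per-protein alignments, the four output options from the help text can be added to the same call. This is a sketch under assumptions: the output directory `aga_out` and the file names inside it are hypothetical, and the binary path follows the usage line above. It falls back to a dry run when the binary or the inputs are not present.

```shell
#!/bin/sh
AGA="${AGA:-build/src/aga}"        # binary location as shown in the usage line
REF="${REF:-NC_001802.gb}"         # GenBank reference record
QUERY="${QUERY:-query.fasta}"      # nucleic acid query sequence
OUT="${OUT:-aga_out}"              # hypothetical output directory

# Global alignment plus all four optional alignment outputs.
set -- "$AGA" --global \
  --cds-aa-alignments="$OUT/cds_aa.fasta" \
  --cds-nt-alignments="$OUT/cds_nt.fasta" \
  --protein-aa-alignments="$OUT/protein_aa.fasta" \
  --protein-nt-alignments="$OUT/protein_nt.fasta" \
  "$REF" "$QUERY" "$OUT/alignment.fasta"

if [ -x "$AGA" ] && [ -f "$REF" ] && [ -f "$QUERY" ]; then
  mkdir -p "$OUT"                  # create the output directory only for a real run
  "$@"
else
  echo "dry run: $*"               # show what would be executed
fi
```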

License

This project is licensed under the Emweb Non-Commercial Public License. See the LICENSE.txt file for details.
