Code Monkey home page Code Monkey logo

fapr's Introduction

FapR: Frequency-based amplicon phasing in R

This script performs the phasing of resistance markers from polyclonal infections of Plasmodium falciparum utilizing only allele frequencies. As of now, 7 resistance markers across 4 mad4hatter amplicons from the dhfr and dhps genes are supported.

Dependencies

The following R packages must be installed: ggplot2, dplyr, gridExtra, and optparse.

Usage

Rscript FapR.R -i [resmarker_table_global_max_0_filtered.csv] -o [output_prefix]

Script Overview

Assumptions

  1. All alleles in the input file are true alleles. Appropriate filtering is strongly suggested.
  2. There are no copy number variants (CNV) on the amplicons.

Phasing

FapR uses an iterative approach in which haplotypes are accepted based on:

  1. Probability of occurring in a sample: haplotypes built from highly abundant resmarkers are more likely to be true.
  2. Variance on the resmarker frequencies: haplotypes built from similarly abundant resmarkers are more likely to be true.

algo

Figure 1. FapR's phasing algorithm.

example

Figure 2. Example of the phasing process. The best haplotype on each iteration (highest probability and lowest coefficient of variation) is highlighted in green and its assigned frequency in pink. Haplotypes with zero probability are highlighted in orange. This particular sample resulted in 3 haplotypes that add up to 99%.

Flagging

Phased haplotypes are flagged based on:

  1. Frequency in the sequencing run (assuming it is from a given population)

    • A single threshold derived from population frequency
    • Allows to catch haplotypes that are frequent in the run, but have low abundance in particular samples
    • This flag takes precedence over the following
  2. Limit of detection of each amplicon (experimentally tested)

    • A threshold for each amplicon
    • Allows to catch haplotypes that are rare in the run, but moderate to highly abundant in particular samples
    • Allows to build partial haplotypes

flagging

Figure 3. FapR's flagging algorithm. Currently, population frequency threshold is the mean haplotype frequency.

flagging_example

Figure 4. Example of the flagging results. Checkmarks are accepted haplotypes: green = correctly phased and also frequent in the population/run; blue = correctly phased but rare in the population/run; purple = inconclusive phasing but frequent in the population/run. Orange crosses are haplotypes with inconclusive phasing and also rare (or absent) in the population/run, thus being inconclusive haplotypes.

fapr's People

Contributors

manuelgug avatar

Watchers

 avatar

Forkers

skiyaga

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.