GBS-SNP-CROP

Latest release v.4.1 (October 6, 2019)

Introduction

The GBS SNP Calling Reference Optional Pipeline (GBS-SNP-CROP) is executed via a sequence of seven Perl scripts that integrate custom parsing and filtering procedures with well-known, vetted bioinformatic tools, giving the user full access to all intermediate files. By employing a novel strategy of variant (SNP and indel) calling based on the correspondence of within-individual to across-population patterns of polymorphism, the pipeline identifies high-confidence variants and distinguishes them from both sequencing and PCR errors, whether or not a reference genome is available. When no reference genome is available, the pipeline adopts a clustering strategy to build a population-tailored "Mock Reference" from the same GBS data for downstream calling and genotyping. Designed for libraries of either paired-end (PE) or single-end (SE) reads of arbitrary lengths, GBS-SNP-CROP maximizes data usage by avoiding the unnecessary data culling caused by imposed length-uniformity requirements. GBS-SNP-CROP is a complete bioinformatics pipeline developed primarily to support curation, research, and breeding programs wishing to use GBS for the cost-effective genome-wide characterization of plant genetic resources.

Pipeline workflow

Stage 1. Process the raw GBS data

  • Step 1: Parse the raw reads
  • Step 2: Trim based on quality and adaptors
  • Step 3: Demultiplex

Stage 2. Build the Mock Reference

  • Step 4: Cluster reads and assemble the Mock Reference

Stage 3. Map the processed reads and generate standardized alignment files

  • Step 5: Align with BWA-mem and process with SAMtools
  • Step 6: Parse mpileup outputs and produce the variants discovery matrix

Stage 4. Call variants and genotypes

  • Step 7: Filter variants and call genotypes
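
As a rough illustration of how the stages fit together, the sketch below encodes the seven steps listed above and selects which ones to run. It is not part of GBS-SNP-CROP: the names `Step`, `WORKFLOW`, and `plan_steps` are hypothetical, and the rule that Stage 2 is skipped when a reference genome is supplied is an assumption drawn from the pipeline's description of the Mock Reference as the no-reference path.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    number: int       # step number as listed in the workflow
    stage: int        # stage the step belongs to
    description: str  # step title copied from the workflow


# Steps and stages as listed in the workflow above.
WORKFLOW = [
    Step(1, 1, "Parse the raw reads"),
    Step(2, 1, "Trim based on quality and adaptors"),
    Step(3, 1, "Demultiplex"),
    Step(4, 2, "Cluster reads and assemble the Mock Reference"),
    Step(5, 3, "Align with BWA-mem and process with SAMtools"),
    Step(6, 3, "Parse mpileup outputs and produce the variants discovery matrix"),
    Step(7, 4, "Filter variants and call genotypes"),
]


def plan_steps(reference_genome: Optional[str] = None) -> List[Step]:
    """Return the steps to run.

    Assumption: Stage 2 (building the Mock Reference) is only needed
    when no reference genome is available.
    """
    if reference_genome is None:
        return list(WORKFLOW)
    return [step for step in WORKFLOW if step.stage != 2]


if __name__ == "__main__":
    for step in plan_steps():
        print(f"Stage {step.stage}, Step {step.number}: {step.description}")
```

Calling `plan_steps("genome.fa")` (a hypothetical file name) would return the same list without Step 4.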

Below is a schematic of the workflow, with inputs and outputs (boxes) indicated for each step (arrows).

Released versions

v.4.1: Released on 10/6/2019
v.4.0: Released on 10/22/2018
v.3.0: Released on 2/8/2018
v.2.0: Released on 2/22/2017
v.1.1: Released on 3/11/2016
v.1.0: Released on 1/12/2016

Getting Help

Begin by carefully reading the GBS-SNP-CROP User Manual. Before posting a question or starting a discussion, please consult the FAQ page. Also check your barcode ID file for empty characters or stray blank spaces, and verify that it was saved as a tab-delimited file. If you still face an issue or have suggestions for improving this tool, please submit your question or comment to our Google Groups page.
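
The barcode ID file checks mentioned above can be automated. The following is a minimal sketch, not a tool shipped with the pipeline: the function name `check_barcode_lines` is hypothetical, and it only assumes what the text states (tab-delimited fields, with no empty fields or embedded whitespace). It does not assume how many columns the file should have, so consult the User Manual for the exact layout.

```python
from typing import List


def check_barcode_lines(lines: List[str]) -> List[str]:
    """Return a list of human-readable problems found in barcode ID file lines."""
    problems = []
    field_counts = set()
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if line == "":
            problems.append(f"line {number}: blank line")
            continue
        if "\t" not in line:
            problems.append(f"line {number}: no tab delimiter found")
        fields = line.split("\t")
        field_counts.add(len(fields))
        for position, field in enumerate(fields, start=1):
            if field == "":
                problems.append(f"line {number}, field {position}: empty field")
            elif any(ch.isspace() for ch in field):
                problems.append(f"line {number}, field {position}: contains whitespace")
    # Lines with differing field counts usually indicate a missing or extra tab.
    if len(field_counts) > 1:
        problems.append("lines have differing numbers of tab-separated fields")
    return problems


if __name__ == "__main__":
    # Made-up example lines; the second one uses a space instead of a tab.
    sample = ["ACGT\tsample1\n", "TTGCA sample2\n"]
    for problem in check_barcode_lines(sample):
        print(problem)
```

To check a real file, read it with `open(path).readlines()` and pass the result to the function. An empty list means none of these particular problems were found.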

Requirements

  • Java 7 or higher - The latest version of GBS-SNP-CROP (v.4.1) was tested using Java 8 (update 221)
  • Trimmomatic - Latest version tested with v.0.39 (Bolger et al., 2014)
  • PEAR - Latest version tested with v.0.9.11 (Zhang et al., 2014)
  • VSEARCH - Latest version tested with v.2.13.7 (Rognes et al., 2016)
  • BWA aligner - Latest version tested with v.0.7.12 (Li & Durbin, 2009)
  • SAMtools - Latest version tested with v.1.7 (Li et al., 2009)
  • The following five CPAN modules also need to be installed: Getopt::Long, IO::Zlib, List::Util, List::MoreUtils, Parallel::ForkManager
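
Before running the pipeline, it can help to confirm that these requirements are visible from the shell. The sketch below is one possible way to do so and is not part of GBS-SNP-CROP. The executable names in `TOOLS` are assumptions about a typical installation, and Trimmomatic is left out because it is normally distributed as a Java `.jar` whose location varies. Perl modules are probed with `perl -MModule -e 1`, using the case-sensitive CPAN spellings `Getopt::Long` and `IO::Zlib`.

```python
import shutil
import subprocess
from typing import List

# Assumed executable names; adjust to match the local installation.
TOOLS = ["java", "pear", "vsearch", "bwa", "samtools", "perl"]

# The five Perl modules listed in the requirements (CPAN spelling).
PERL_MODULES = [
    "Getopt::Long",
    "IO::Zlib",
    "List::Util",
    "List::MoreUtils",
    "Parallel::ForkManager",
]


def missing_tools(tools: List[str]) -> List[str]:
    """Return the executables that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def missing_perl_modules(modules: List[str], perl: str = "perl") -> List[str]:
    """Return the Perl modules that fail to load with `perl -MModule -e 1`."""
    if shutil.which(perl) is None:
        # Without a Perl interpreter, no module can be confirmed.
        return list(modules)
    missing = []
    for module in modules:
        result = subprocess.run(
            [perl, f"-M{module}", "-e", "1"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            missing.append(module)
    return missing


if __name__ == "__main__":
    print("Missing tools:", missing_tools(TOOLS) or "none")
    print("Missing Perl modules:", missing_perl_modules(PERL_MODULES) or "none")
```

This only reports presence, not versions. Checking against the tested versions listed above would need each tool's own version flag.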

Citing GBS-SNP-CROP

Melo et al. GBS-SNP-CROP: A reference-optional pipeline for SNP discovery and plant germplasm characterization using genotyping-by-sequencing data. BMC Bioinformatics. 2016. 17:29. DOI 10.1186/s12859-016-0879-y.
