FastQParse

Java toolkit for preprocessing .fastq files for the GBS Pipeline

Download FastQParse.jar for the latest compiled binaries.

Description:

FastQParse handles the initial processing of sequenced DNA for the GBS pipeline, which is necessary for base calling, alignment, and downstream analysis. It integrates many of the steps needed for preprocessing into a single command. Single-end, paired-end, and UMI FASTQ files are all supported. Several matching algorithms are implemented to maximize the flexibility of the toolkit. Because each feature can also be run separately, FastQParse can handle files for other pipelines as well.

Features:

  • Demultiplex single or paired-end reads (two matching algorithms)
  • Deduplicate (UMI)
  • Support for UMI index files
  • Merge paired-end reads (two matching algorithms)
  • Quality trim (two trimming algorithms)
  • Remove adapters (5' or 3' and anchored or not anchored, two matching algorithms)
  • Gzip input/output
  • Support for different types of mismatches (Hamming, Levenshtein, or Bayesian probability)
  • Remove reads with low average quality or high expected error rate
  • Comprehensive stats when demultiplexing (.stats file)
  • Dump removed reads to other files
  • Do all of the above in one command (demultiplexing, quality trimming, adapter trimming, paired-end merging, etc. can run together)
  • Or use quality trimming, adapter removal, paired-end merging, deduplication, etc. as standalone features (e.g. only remove adapters in one file)
  • Parallel processing using multi-threading
  • Efficient and flexible matching using bit-parallel algorithms and probability based matching
  • And more...
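
To make two of the features above more concrete, here is a minimal Java sketch of Hamming-distance barcode matching and of filtering reads by average quality or expected error rate. It is an illustration under stated assumptions, not FastQParse's actual code: the class name ReadFilterSketch and all method names are hypothetical, quality strings are assumed to use the Phred+33 encoding, and the barcode is assumed to sit at the start of the read. A simple character-by-character loop stands in for the bit-parallel matching the toolkit advertises.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; names and behavior are assumptions, not FastQParse's API.
final class ReadFilterSketch {
    // Assumption: quality characters are Phred scores offset by 33.
    static final int PHRED_OFFSET = 33;

    /** Number of mismatching positions, or -1 if the strings differ in length. */
    static int hammingDistance(String a, String b) {
        if (a.length() != b.length()) return -1;
        int distance = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) distance++;
        }
        return distance;
    }

    /**
     * Index of the single best barcode matching the start of the read with at
     * most maxMismatches substitutions, or -1 if none matches or the best is tied.
     */
    static int matchBarcode(String read, List<String> barcodes, int maxMismatches) {
        int best = -1;
        int bestDistance = Integer.MAX_VALUE;
        boolean tie = false;
        for (int i = 0; i < barcodes.size(); i++) {
            String barcode = barcodes.get(i);
            if (read.length() < barcode.length()) continue;
            int d = hammingDistance(read.substring(0, barcode.length()), barcode);
            if (d > maxMismatches) continue;
            if (d < bestDistance) {
                best = i;
                bestDistance = d;
                tie = false;
            } else if (d == bestDistance) {
                tie = true; // ambiguous: two barcodes are equally close
            }
        }
        return tie ? -1 : best;
    }

    /** Arithmetic mean of the Phred scores; 0 for an empty quality string. */
    static double averageQuality(String quality) {
        if (quality.isEmpty()) return 0.0;
        long sum = 0;
        for (int i = 0; i < quality.length(); i++) {
            sum += quality.charAt(i) - PHRED_OFFSET;
        }
        return (double) sum / quality.length();
    }

    /** Mean per-base error probability, using p = 10^(-Q/10) for each base. */
    static double expectedErrorRate(String quality) {
        if (quality.isEmpty()) return 0.0;
        double expectedErrors = 0.0;
        for (int i = 0; i < quality.length(); i++) {
            int q = quality.charAt(i) - PHRED_OFFSET;
            expectedErrors += Math.pow(10.0, -q / 10.0);
        }
        return expectedErrors / quality.length();
    }

    public static void main(String[] args) {
        List<String> barcodes = Arrays.asList("ACGA", "TTTT");
        System.out.println(matchBarcode("ACGTTTGA", barcodes, 1));
        System.out.println(averageQuality("IIII"));
        System.out.println(expectedErrorRate("IIII"));
    }
}
```

A read would be dropped when averageQuality falls below a chosen threshold or expectedErrorRate rises above one. Returning -1 on a tie is a design choice of this sketch: assigning an ambiguous read to neither sample avoids contaminating either one.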

Tutorial available here.

List of commands available here.

Algorithms and implementation details available here.

Wiki and other information available here.
