LCVCFtools

LCVCFtools is a simple C++ program for working with VCF v4.2 files generated from low-coverage whole genome sequencing. It is useful for optimizing data before genotype imputation by removing samples or variants with excessive missing data.

Compiling

You can compile the program with qmake using the LCVCFtools.pro file. On Debian or Ubuntu, install the dependencies, clone the repository, and build with the following commands:

sudo apt install qt5-qmake libboost-all-dev libz-dev  
git clone https://github.com/marcusnizalvarez/LCVCFtools.git  
cd LCVCFtools/  
qmake && make

Usage

Input mode

  • --vcf Read from VCF file. Use - to read from stdin.
  • --gzvcf Read from Gzip compressed VCF file. Use - to read from stdin.

Filter parameters

  • --minGQ Minimum genotype quality (GQ), on the Phred scale. [Default=20]
  • --minDP Minimum depth (DP). [Default=5]
  • --MAF Minimum minor allele frequency, based on allele depth (AD). [Default=0.1]
  • --minGCR Minimum genotype call rate. [Default=0]
  • --minDPR Minimum DP rate. Can be defined multiple times.
  • --minGQR Minimum GQ rate. Can be defined multiple times.
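
The filter parameters above can be pictured with a minimal C++ sketch. It is not taken from the LCVCFtools source: the `SampleCall` struct and the function names are hypothetical, and it assumes that the genotype call rate is the fraction of samples whose genotype is called and meets both --minGQ and --minDP, and that the AD-based MAF pools allele depths over all samples at a biallelic site. The real program may combine these filters differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-sample record; the fields mirror the VCF FORMAT keys.
struct SampleCall {
    bool called;  // false when the genotype is missing ("./.")
    int gq;       // GQ: Phred-scaled genotype quality
    int dp;       // DP: read depth
    int ref_ad;   // AD for the reference allele
    int alt_ad;   // AD for the alternate allele
};

// Assumed definition: fraction of samples that are called and pass both thresholds.
double genotype_call_rate(const std::vector<SampleCall>& calls, int min_gq, int min_dp) {
    if (calls.empty()) return 0.0;
    std::size_t passing = 0;
    for (const SampleCall& c : calls) {
        if (c.called && c.gq >= min_gq && c.dp >= min_dp) ++passing;
    }
    return static_cast<double>(passing) / static_cast<double>(calls.size());
}

// Assumed definition: minor allele frequency from allele depths pooled over all samples.
double maf_from_allele_depth(const std::vector<SampleCall>& calls) {
    long long ref = 0;
    long long alt = 0;
    for (const SampleCall& c : calls) {
        ref += c.ref_ad;
        alt += c.alt_ad;
    }
    const long long total = ref + alt;
    if (total == 0) return 0.0;
    return static_cast<double>(std::min(ref, alt)) / static_cast<double>(total);
}

// A variant is kept only if it reaches both the call-rate and the MAF threshold.
bool keep_variant(const std::vector<SampleCall>& calls, int min_gq, int min_dp,
                  double min_gcr, double min_maf) {
    assert(min_gcr >= 0.0 && min_gcr <= 1.0);  // rates are fractions
    return genotype_call_rate(calls, min_gq, min_dp) >= min_gcr &&
           maf_from_allele_depth(calls) >= min_maf;
}
```

Under these assumptions, `keep_variant(calls, 20, 5, 0.25, 0.1)` corresponds to the thresholds --minGQ 20 --minDP 5 --minGCR 0.25 --MAF 0.1.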

Other arguments

  • --remove Remove samples listed in a file.
  • --keep Keep samples listed in a file, after --remove.
  • --sample-stats Output sample statistics to 'stats.tsv'.
  • --keep-multiallelic Don't skip multiallelic variants.
  • --ID Generate a generic ID for each variant, useful for programs like PLINK.
  • --verbose Verbose mode.
  • --help Print this message.

Usage example

./LCVCFtools --gzvcf example.vcf.gz - --minGQ 20 --minDP 5 --minGCR 0.25 --minDPR 5 0.5 --MAF 0.1 --sample-stats | gzip -c > output.vcf.gz

Citation

Cite as: Alvarez MVN. LCVCFtools v1.0.2‑alpha. 2022. https://doi.org/10.5281/zenodo.5259931.

This software was originally developed for this paper: Alvarez, MVN. et al. Nyssorhynchus darlingi genome-wide studies related to microgeographic dispersion and blood-seeking behavior. Parasites & Vectors. 2022. 15(1):106.

Credits

Author: Marcus Vinicius Niz Alvarez
Email: [email protected]
São Paulo State University, UNESP - Biotechnology Institute and Bioscience Institute, Botucatu, 18618-689, Brazil.

License

LCVCFtools is released under the GNU GPL v3.0 license; the Boost libraries are under the Boost Software License v1.0.
