SNPRelate: Parallel computing toolset for relatedness and principal component analysis of SNP data

License: GNU General Public License, GPLv3

Features

Genome-wide association studies are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed SNPRelate, an R package for multi-core symmetric multiprocessing computer architectures, to accelerate two key computations on SNP data: principal component analysis (PCA) and relatedness analysis using identity-by-descent measures. The kernels of our algorithms are written in C/C++ and highly optimized.
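As a rough illustration of the two computations described above, the following sketch runs PCA and a method-of-moments IBD estimate on the small example file bundled with the package. It assumes SNPRelate and gdsfmt are already installed; the thread count and the MAF and missing-rate thresholds are arbitrary values chosen for the example, not recommendations.

```r
# Minimal sketch; assumes SNPRelate (and its dependency gdsfmt) is installed.
library(SNPRelate)

# Open the small example SNP GDS file shipped with the package.
genofile <- snpgdsOpen(snpgdsExampleFileName())

# Principal component analysis on two threads.
pca <- snpgdsPCA(genofile, num.thread = 2)
# Percentage of variance explained by the first four components.
pc_percent <- head(pca$varprop, 4) * 100

# Identity-by-descent estimates using the method of moments.
# The filter thresholds below are illustrative only.
ibd <- snpgdsIBDMoM(genofile, maf = 0.05, missing.rate = 0.05, num.thread = 2)
# Data frame with one row per sample pair (k0, k1, kinship).
ibd_pairs <- snpgdsIBDSelection(ibd)

snpgdsClose(genofile)
```

The eigenvectors in `pca$eigenvect` can then be plotted against each other, and `ibd_pairs` can be filtered by kinship coefficient to flag closely related samples.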

The GDS format offers efficient operations specifically designed for two-bit integers, since a SNP genotype can be stored in only two bits. The SNP GDS format in this package is also used by the GWASTools package with the support of S4 classes and generic functions. The extended GDS format is implemented in the SeqArray package to support the storage of single nucleotide variation (SNV), insertion/deletion polymorphism (indel) and structural variation calls.
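To make the two-bit idea concrete, here is a small base-R sketch that packs four genotype codes into one byte and unpacks them again. It only illustrates the storage principle: the function names `pack2bit` and `unpack2bit` are hypothetical, the use of code 3 for a missing genotype is an assumption of this example, and none of it is meant to describe the actual on-disk layout of a GDS file.

```r
# Pack genotype codes (0, 1, 2; 3 = missing, by assumption) four per byte.
pack2bit <- function(geno) {
  stopifnot(all(geno %in% 0:3))
  # Pad to a multiple of four with the missing code.
  pad <- (4L - length(geno) %% 4L) %% 4L
  g <- matrix(as.integer(c(geno, rep(3L, pad))), nrow = 4L)
  # Each column holds four consecutive genotypes and becomes one byte:
  # g1 + g2 * 2^2 + g3 * 2^4 + g4 * 2^6.
  as.raw(g[1, ] + g[2, ] * 4L + g[3, ] * 16L + g[4, ] * 64L)
}

# Recover the first n genotype codes from the packed bytes.
unpack2bit <- function(bytes, n) {
  b <- as.integer(bytes)
  g <- rbind(b %% 4L, (b %/% 4L) %% 4L, (b %/% 16L) %% 4L, b %/% 64L)
  as.integer(g)[seq_len(n)]
}

geno <- c(0, 1, 2, 3, 2, 1)
packed <- pack2bit(geno)          # six genotypes fit in two bytes
restored <- unpack2bit(packed, length(geno))
```

Compared with one 32-bit integer per genotype, this kind of packing reduces storage by a factor of 16, which is the motivation for operating on two-bit values directly.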

Bioconductor:

Release Version: v1.12.0

http://www.bioconductor.org/packages/release/bioc/html/SNPRelate.html

Development Version: v1.13.0

http://www.bioconductor.org/packages/devel/bioc/html/SNPRelate.html

News

Tutorials

http://corearray.sourceforge.net/tutorials/SNPRelate

http://www.bioconductor.org/packages/devel/bioc/vignettes/SNPRelate/inst/doc/SNPRelateTutorial.html

Citation

Zheng X, Levine D, Shen J, Gogarten SM, Laurie C, Weir BS (2012). A High-performance Computing Toolset for Relatedness and Principal Component Analysis of SNP Data. Bioinformatics. DOI: 10.1093/bioinformatics/bts606.

Zheng X, Gogarten S, Lawrence M, Stilp A, Conomos M, Weir BS, Laurie C, Levine D (2017). SeqArray -- A storage-efficient high-performance data format for WGS variant calls. Bioinformatics. DOI: 10.1093/bioinformatics/btx145.

Installation

  • Bioconductor repository:
source("http://bioconductor.org/biocLite.R")
biocLite("SNPRelate")
  • Development version from GitHub:
library("devtools")
install_github("zhengxwen/gdsfmt")
install_github("zhengxwen/SNPRelate")

The install_github() approach builds the packages from source, so make and a compiler toolchain must be installed on your system (see the R FAQ for your operating system); you may also need to install dependencies manually.

wget --no-check-certificate https://github.com/zhengxwen/gdsfmt/tarball/master -O gdsfmt_latest.tar.gz
wget --no-check-certificate https://github.com/zhengxwen/SNPRelate/tarball/master -O SNPRelate_latest.tar.gz
R CMD INSTALL gdsfmt_latest.tar.gz
R CMD INSTALL SNPRelate_latest.tar.gz

## Or
curl -L https://github.com/zhengxwen/gdsfmt/tarball/master/ -o gdsfmt_latest.tar.gz
curl -L https://github.com/zhengxwen/SNPRelate/tarball/master/ -o SNPRelate_latest.tar.gz
R CMD INSTALL gdsfmt_latest.tar.gz
R CMD INSTALL SNPRelate_latest.tar.gz

Implementation with Intel Intrinsics

Implementation Table:

Function No SIMD SSE2 AVX AVX2 AVX-512
snpgdsDiss X
snpgdsEIGMIX X X X
snpgdsGRM X X X
snpgdsIBDKING X X X
snpgdsIBDMoM X
snpgdsIBS X X
snpgdsIBSNum X X
snpgdsIndivBeta X X P X
snpgdsPCA X X X
snpgdsPCACorr X
snpgdsPCASampLoading X
snpgdsPCASNPLoading X
...

X: fully supported; .: partially supported; P: POPCNT instruction.

Install the package from the source code with the support of Intel SIMD Intrinsics:

You have to customize the package compilation; see "Customizing package compilation" in the R Installation and Administration manual on CRAN.

Assuming the GNU compilers (gcc/g++) or Clang (clang/clang++) are installed, set the following in ~/.R/Makevars:

## for C code
CFLAGS=-g -O2 -march=native -mtune=native
## for C++ code
CXXFLAGS=-g -O2 -march=native -mtune=native

Implementation with OpenCL

In progress ...
