The usat from raydebashree

Description

USAT (Unified Score-based Association Test) uses a data-adaptive weighted score-based test statistic for testing association of multiple continuous phenotypes with a single genetic marker. The R function usat implements this association test. For details of this statistical method, please refer/cite:

Ray, D., Pankow, J.S., Basu, S. "USAT: A Unified Score-based Association Test for Multiple Phenotype-Genotype Analysis". Genetic Epidemiology, 40(1):20-34, 2016. PMID: 26638693

Key Words: GWAS; MANOVA; Multiple phenotypes; Multivariate analysis; Pleiotropy; Score test

Requirements

R (>= 3.0.1), CompQuadForm, minqa, survey

How to Install within R

require(devtools)
source_url("https://github.com/RayDebashree/USAT/blob/master/usat_v1.21.R?raw=TRUE")

It is recommended to download/copy the stand-alone R program in this repository, save it in your local directory of choice and source() it from your local directory. When a new version of the software is available, older versions may be removed from this repository, and the above devtools::source_url() technique may not work.

Changes

Version 1.21 - December 12, 2016

An updated version of the software with more user controls.

Version 1.1 - April 07, 2016

First public release of the software.

Usage

Simple example

usat(Y, X, COV=NULL, na.check=TRUE, na.check.msg=TRUE, manova.out=FALSE, AbsTol=.Machine$double.epsˆ0.8)

Arguments

Input	Description
`Y`	The `nxK` phenotype matrix, where `n` is the number of individuals and `K` is the number of phenotypes. The joint association of all `K` phenotypes with the single marker will be tested. `Y` needs to be in R matrix format.
`X`	The `nx1` column matrix for the single genetic marker, where `n` is the number of individuals. `X` needs to be in R matrix format.
`COV`	The `nxq` matrix of covariates that need to be adjusted in the model. `q` is the number of such covariates. `COV` needs to be in R matrix format. The default value is NULL, i.e., it is assumed there is no covariate in the model.
`na.check`	If value is TRUE (default), the code will check for presence of missing values (coded as `NA`). USAT requires complete observations and any individual with at least one missing value in either `Y`, `X` or `COV` will be removed. Removal of missing observations may substantially reduce the sample size `n` and hence power to detect association. If a substantial proportion of the individuals have missing data, it is recommended (if possible) to impute the missing values before using USAT.
`na.check.msg`	If value is TRUE (default), user will receive message updates when presence of missing values (coded as `NA`) is checked.
`manova.out`	If value is FALSE (default), MANOVA statistic and p-value will not be included in the final output.
`AbsTol`	The user can specify the absolute tolerance value used in the numerical integration for evaluating USAT p-values. Default value is 3e-13. `integrate()` function is used for numerical integration.

Value

Output	Description
`T.usat`	The value of the USAT test statistic (scalar).
`omg.opt`	The optimal weight $\omega$ based on a grid search over [0, 1].
`p.usat`	The p-value of association based on the USAT statistic.
`n.obs`	Number of individuals (with complete observations) used for testing association.
`T.manova`	The value of the MANOVA test statistic (scalar). Provided if `manova.out=TRUE`.
`p.manova`	The p-value of association based on MANOVA statistic. Provided if `manova.out=TRUE`.

A Working Example

source("usat_v1.21.R")

# simulate 2 phenotypes on 1000 individuals
library(MASS) # needed for multivariate normal simulation
Y<-mvrnorm(n=1000, mu=c(0,0), Sigma=matrix(c(1,0.2,0.2,1),2,2))

# simulate a single marker for 1000 individuals
X<-matrix(rbinom(n=1000, size=2, prob=0.2), ncol=1) # additive model

# apply USAT to test association
u.out<-usat(Y=Y, X=X, COV=NULL, na.check=FALSE)

# USAT test statistic and p-value
t<-u.out$T.usat
p<-u.out$p.usat

Notes

The method USAT and its software is designed for multiple continuous phenotypes from a random sample. If the ascertainment of individuals in the sample is non-random (e.g., in case-control retrospective study design), it is advisable to account for the sampling scheme (e.g., adjusting the sampling variable as a covariate) when using USAT. One may also use methods and tools designed specifically for the analysis of secondary phenotypes. We proposed one such method (POM-PS) and its software is publicly available.
- Caution: USAT is a single-variant association test; so not expected to work well for rare variants (i.e., genetic variants with very low allele-frequencies).
Although USAT software can adjust for covariates, it is advisable to perform covariate adjustments and necessary transformations (e.g., inverse-normal transformation) on the traits and then apply USAT on the residuals. This will speed up total computation time.
The method USAT and its software is designed for unrelated individuals. If you have two cohorts with overlapping samples and you want to analyse the combined sample, it is desirable to exclude the overlapping individuals, and any related individuals.
If you receive an error like the integral is probably divergent, try reducing the absolute tolerance parameter AbsTol.
If you wish to test genetic association of multiple traits (categorical and/or continuous) and you have access to summary statistics only, please use the new software metaUSAT that is publicly available now!
If removal of related individuals is a concern (e.g., sample size gets greatly reduced) or if there is cryptic relatedness that needs to be accounted for, it is advisable to first obtain the GWAS summary statistics for each trait by analyzing each trait separately accounting for the relatedness among the individuals (e.g., using EPACTS to analyze each trait). Then apply metaUSAT on the single trait GWAS summary statistics.
USAT, like most other cross-phenotype association methods, assumes multivariate normality of the underlying traits. While transformations like rank-based inverse-normalization help induce univariate normality of each trait, it does not guarantee multivariate normality of the joint distribution of the traits. More on this coming up soon!

raydebashree / usat Goto Github PK

usat's Introduction

Description

Requirements

How to Install within R

Changes

Usage

Simple example

Arguments

Value

A Working Example

Notes

usat's People

Contributors

Stargazers

Watchers

Forkers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent