viprs's Introduction

viprs: Variational Inference of Polygenic Risk Scores

viprs is a Python package that implements variational inference techniques to estimate the posterior distribution of variant effect sizes conditional on GWAS summary statistics. The package is designed to be fast and accurate, and to give the user a variety of options for customizing the inference process. Highlighted features:

  • The coordinate ascent algorithms are written in C/C++ and Cython for improved speed and efficiency.
  • The code is written in object-oriented form, allowing the user to extend and experiment with existing implementations.
  • Different priors on the effect size: spike-and-slab, sparse mixture, etc.
  • Scripts for different hyperparameter tuning strategies, including grid search, Bayesian optimization, and Bayesian model averaging.
  • Easy and straightforward interfaces for computing PRS from fitted models.
  • Implementations of a wide variety of evaluation metrics for both binary and continuous phenotypes.

viprs's People

Contributors

shz9

viprs's Issues

Added pseudo R-squared metrics to evaluation

Many papers report pseudo R-squared metrics, such as the Nagelkerke R2, for binary (case/control) traits. It may be useful to add implementations of this and other related metrics to the eval.metrics module.

Tasks:

  • Add implementation for the Nagelkerke R2
  • Add implementation for the Cox & Snell R2
  • Add implementation for the McFadden R2

Here's a reference that may be useful for the implementation: https://web.pdx.edu/~newsomj/cdaclass/ho_logistic.pdf
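The three metrics listed above can all be derived from the log-likelihoods of a fitted model and an intercept-only (null) model. The following is a minimal sketch using the standard textbook definitions, not the package's eval.metrics API: the function names are hypothetical, and it assumes the caller already has predicted case probabilities (for example, from a logistic regression of the phenotype on the PRS) and that both cases and controls are present.

```python
import math


def _bernoulli_log_lik(y, p, eps=1e-12):
    """Log-likelihood of binary labels y under predicted probabilities p."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)  # clip to avoid log(0)
        total += yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return total


def pseudo_r2(y, p_full):
    """Hypothetical helper: McFadden, Cox & Snell, and Nagelkerke pseudo R2.

    y      : sequence of 0/1 labels (assumed to contain both classes)
    p_full : predicted probabilities of y == 1 from the fitted model
    """
    n = len(y)
    prevalence = sum(y) / n
    if not 0.0 < prevalence < 1.0:
        raise ValueError("Both cases and controls are required.")

    # The null model predicts the sample prevalence for every individual.
    ll_null = _bernoulli_log_lik(y, [prevalence] * n)
    ll_full = _bernoulli_log_lik(y, p_full)

    mcfadden = 1.0 - ll_full / ll_null
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_full) / n)
    # Nagelkerke rescales Cox & Snell by its maximum attainable value.
    nagelkerke = cox_snell / (1.0 - math.exp(2.0 * ll_null / n))

    return {"mcfadden": mcfadden, "cox_snell": cox_snell, "nagelkerke": nagelkerke}
```

A model whose predictions equal the sample prevalence scores zero on all three metrics, which makes a convenient sanity check.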

Invalid LD Matrix: Element 0 does not have matching LD boundaries!

Sorry to disturb!
I was trying to construct a shrinkage LD matrix using my genotype file. Here is my script:

import magenpy as mgp
gdl = mgp.GWADataLoader("./LD_reference/plink/plink",
                backend='plink')
gdl.compute_ld(estimator='shrinkage',
                genetic_map_ne=11400,
                genetic_map_sample_size=183,
                output_dir='./LD_reference/EUR/VIPRS')

I encountered the following error:

> Reading BED file...
Computing LD matrices:   0%|                                      | 0/22 [00:00<?, ?it/s]/software/conda/envs/wdl/lib/python3.7/site-packages/scipy/sparse/_index.py:125: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
  self._set_arrayXarray(i, j, x)
Traceback (most recent call last):
  File "VIPRS_EURLD.py", line 7, in <module>
    output_dir='./LD_reference/EUR/VIPRS')
  File "/software/conda/envs/wdl/lib/python3.7/site-packages/magenpy/GWADataLoader.py", line 563, in compute_ld
    disable=not self.verbose or len(self.genotype) < 2)
  File "/software/conda/envs/wdl/lib/python3.7/site-packages/magenpy/GWADataLoader.py", line 560, in <dictcomp>
    for c, g in tqdm(sorted(self.genotype.items(), key=lambda x: x[0]),
  File "/software/conda/envs/wdl/lib/python3.7/site-packages/magenpy/GenotypeMatrix.py", line 260, in compute_ld
    return ld_est.compute(output_dir, temp_dir=tmp_ld_dir.name)
  File "/software/conda/envs/wdl/lib/python3.7/site-packages/magenpy/stats/ld/estimator.py", line 227, in compute
    temp_dir)
  File "/software/conda/envs/wdl/lib/python3.7/site-packages/magenpy/stats/ld/estimator.py", line 90, in compute
    if _validate_ld_matrix(ld_mat):
  File "/software/conda/envs/wdl/lib/python3.7/site-packages/magenpy/stats/ld/utils.py", line 45, in _validate_ld_matrix
    raise ValueError(f"Invalid LD Matrix: Element {i} does not have matching LD boundaries!")
ValueError: Invalid LD Matrix: Element 0 does not have matching LD boundaries!
Computing LD matrices:   0%|                                      | 0/22 [01:07<?, ?it/s]

Do you know what caused this error, and how to fix it? If you need additional information, please feel free to ask me.
Thank you so much in advance!

Does not execute.

Traceback (most recent call last):
  File "viprs.py", line 2, in <module>
    import viprs as vp
  File "/home/muneeb/Desktop/PRSTOOLS/PRS playground/viprs.py", line 17, in <module>
    v = vp.VIPRS(gdl)
AttributeError: partially initialized module 'viprs' has no attribute 'VIPRS' (most likely due to a circular import)
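
The traceback shows that the script itself is named viprs.py. Python places the script's directory first on the module search path, so "import viprs" most likely resolves to the script rather than the installed package, and renaming the script should resolve the error. The sketch below illustrates one way to check for this kind of name clash; the helper name is hypothetical and not part of viprs.

```python
import os


def find_shadowing_script(module_name, script_dir):
    """Hypothetical helper: return the path of a local '<module_name>.py'
    in script_dir that would shadow an installed package, or None."""
    candidate = os.path.join(script_dir, module_name + ".py")
    return candidate if os.path.isfile(candidate) else None


if __name__ == "__main__":
    # Check the current working directory for a file that shadows 'viprs'.
    clash = find_shadowing_script("viprs", os.getcwd())
    if clash is not None:
        print(f"Rename {clash}: it shadows the installed 'viprs' package.")
```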

AttributeError: module 'collections' has no attribute 'Iterable'

When I run the test example given on the project page, the call to "mgp.GWADataLoader" fails:

gdl = mgp.GWADataLoader(bed_files=mgp.tgp_eur_data_path(), sumstats_files=mgp.ukb_height_fastGWA_path(), sumstats_format="fastGWA")

The error is: AttributeError: module 'collections' has no attribute 'Iterable'

I am using Python 3.10 (22.04.1-Ubuntu) and installed viprs using pip as suggested in the manual.
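
For context, the abstract base class aliases such as collections.Iterable were deprecated in Python 3.3 and removed in Python 3.10; they now live only in collections.abc. Which package in the dependency chain still uses the old name is not clear from the report, so the snippet below only illustrates the portable import, using a hypothetical helper function.

```python
# Portable import: works on all supported Python 3 versions,
# whereas `collections.Iterable` raises AttributeError on Python 3.10+.
from collections.abc import Iterable


def is_iterable(obj):
    """Hypothetical helper: True if obj supports iteration."""
    return isinstance(obj, Iterable)


if __name__ == "__main__":
    print(is_iterable(["chr1", "chr2"]))
    print(is_iterable(22))
```

Until the affected package is updated, upgrading the installed dependencies or running under Python 3.9 or earlier are the usual workarounds.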

Explore acceleration methods for Variational EM

In RSS-E, Zhu and Stephens (Supplementary material) employ an acceleration scheme to speed up the convergence of the variational EM algorithm.

We would like to implement similar solutions for VIPRS to reduce the number of iterations required for convergence.

Here are some resources that may be helpful in this setting:

[1] Alternating cyclic extrapolation methods for optimization algorithms
Nicolas Lepage-Saucier

[2] SQUAREM: Squared Extrapolation Methods for Accelerating EM-Like Monotone Algorithms
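
To make the idea concrete, here is a minimal sketch of a squared extrapolation (SQUAREM-style) step applied to a generic fixed-point update. It is not the VIPRS or RSS-E implementation: the function names are hypothetical, parameters are flat lists of floats, and the objective-based safeguards used in practice (for example, rejecting a step that decreases the ELBO) are omitted.

```python
import math


def _norm(vec):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(c * c for c in vec))


def squarem(update, x0, tol=1e-10, max_iter=1000):
    """Hypothetical helper: accelerate the fixed-point iteration x <- update(x).

    Each cycle takes two ordinary updates, extrapolates along the
    resulting direction, then applies one stabilizing update.
    """
    x = list(x0)
    for _ in range(max_iter):
        x1 = update(x)
        r = [a - b for a, b in zip(x1, x)]  # first difference
        if _norm(r) < tol:
            return x1  # converged: the update no longer moves x
        x2 = update(x1)
        # Second difference: (x2 - x1) - (x1 - x)
        v = [(c - b) - d for c, b, d in zip(x2, x1, r)]
        v_norm = _norm(v)
        if v_norm == 0.0:
            x = x2  # no curvature information; keep the plain update
            continue
        alpha = -_norm(r) / v_norm  # step length
        x_prime = [xi - 2.0 * alpha * ri + alpha * alpha * vi
                   for xi, ri, vi in zip(x, r, v)]
        x = update(x_prime)  # stabilizing update after extrapolation
    return x


if __name__ == "__main__":
    # Toy fixed-point problem: x = cos(x)
    print(squarem(lambda x: [math.cos(x[0])], [1.0]))
```

In a variational EM setting, update would correspond to one full pass of the coordinate ascent and hyperparameter updates, with all parameters flattened into a single vector.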
