Code Monkey home page Code Monkey logo

Comments (9)

xihaoli avatar xihaoli commented on July 23, 2024

Hi Eugene,
Thanks for your message. This error seems to be caused by some numerical issues when fitting the null model. One quick consideration is to use the fitNullModel() function from the GENESIS package to fit the null model (with the current GRM), and then use the genesis2staar_nullmodel() function from the STAARpipeline package to convert the GENESIS null model object to a STAAR null model object. Both STAAR and GENESIS share the same statistical null model when performing association tests.
Please feel free to let me know if this works.
Best,
Xihao

from staar.

eugenegardner avatar eugenegardner commented on July 23, 2024

Hello Xihao,

Thank you for the quick reply. Unfortunately, when I implemented the null model via GENESIS as suggested:

obj_nullmodel <- fitNullModel(x = data_for_STAAR, outcome = pheno_name, covars = covariates, cov.mat = sparse_kinship, family = "gaussian", rescale = "none")

I get a very similar error:

Computing Variance Component Estimates...
Sigma^2_A     log-lik     RSS
[1]  4.958985e-01  4.958985e-01 -5.903154e+05  1.313380e+00
[1]  9.029733e-01  4.702472e-01 -5.901148e+05  1.061299e+00
Error in .local(x, ...) : 
  internal_chm_factor: Cholesky factorization failed
In addition: Warning message:
In .local(x, ...) :
  Cholmod warning 'not positive definite' at file ../Cholesky/t_cholmod_rowfac.c, line 430

Any suggestions?

I would hope this is very easy to reproduce. I simply acquired the sparse matrix (.rel) from The UK Biobank website, and converted it into a symmetric dsTMatrix from the Matrix package. I then limited it to individuals included in the covariate table (given to 'x' above).

from staar.

xihaoli avatar xihaoli commented on July 23, 2024

Hi Eugene,

Thank you very much for letting me know.

In this case, I think you could try some different thresholds of the sparse GRM by gradually increasing the threshold from 0.0442 to something slightly higher and see if the null model would fit. For this route, it would be ideal to keep all pairwise (estimated) relatedness within a cluster, even if they are below the threshold, rather than setting all entries below the threshold to 0 in the sparse GRM. To do this, the fit_null_glmmkin() function from the STAAR package has a parameter thresh to convert an input dense GRM to a sparse GRM as defined above and then fit the null model. Equivalently, you could use the makeSparseMatrix() function from the GENESIS package to directly operate on the GRM before fitting the null model.

Hope this helps!

Best,
Xihao

from staar.

eugenegardner avatar eugenegardner commented on July 23, 2024

Hi Xihao,

I don't think there is a 'thresh' parameter? Did you mean kins_cutoff?

I will try the other method as well.

from staar.

eugenegardner avatar eugenegardner commented on July 23, 2024

Hi Xihao,

I tried both approaches and still got the same error. Is markSparseMatrix() really the solution given that the matrix is already sparse?

I used a slightly different (external) approach to threshold and it seemed to work:

sparse_kinship@x <- sapply(sparse_kinship@x, function(x) if_else(x < 0.05, 0.05, x))

Where 0.05 is the threshold. This seems a bit like a suboptimal solution since it misrepresents the relatedness between individuals. Did I misunderstand your comment of:

For this route, it would be ideal to keep all pairwise (estimated) relatedness within a cluster

?

from staar.

xihaoli avatar xihaoli commented on July 23, 2024

Hi Eugene,

  1. My apologies, the thresh parameter refers to kins_cutoff in the fit_null_glmmkin() function;
  2. Your understanding is exactly correct. The approach that "keeps all pairwise relatedness within a cluster, even if they are below the threshold" is considered to be more optimal than the "direct thresholding" approach, since the latter misrepresents the relatedness between individuals;
  3. The makeSparseMatrix() function implements the "optimal" approach as described above, however, it assumes a dense GRM as input. To do this, you could first convert the sparse GRM back to a dense GRM and use that as the input of either the fit_null_glmmkin() function with different values of kins_cutoff (or the makeSparseMatrix() function, with different values of thresh).

Hope this is more clear.

Best,
Xihao

from staar.

eugenegardner avatar eugenegardner commented on July 23, 2024

Hello Xihao,

I've recently noticed via testing additional phenotypes that this seems very dependent on the case/control imbalance of the phenotype being studied. In a set of ~200,000 individuals, phenotypes with a prevalence of < ~0.2% seem to always run into the Cholesky issue I mentioned in my original post. Just wondering if you have any advice on how to handle this as the thresholding issue now seems suboptimal when it is not possible to know a priori what the threshold should be in the first place.

from staar.

xihaoli avatar xihaoli commented on July 23, 2024

Hi Eugene,
Thanks for letting me know. We will take a look at this and get back to you.
Best,
Xihao

from staar.

hanchenphd avatar hanchenphd commented on July 23, 2024

Hello Xihao,

I've recently noticed via testing additional phenotypes that this seems very dependent on the case/control imbalance of the phenotype being studied. In a set of ~200,000 individuals, phenotypes with a prevalence of < ~0.2% seem to always run into the Cholesky issue I mentioned in my original post. Just wondering if you have any advice on how to handle this as the thresholding issue now seems suboptimal when it is not possible to know a priori what the threshold should be in the first place.

Did you always see the Cholesky issue in this subset (regardless of which phenotypes), or just sometimes? The kinship coefficients provided by the UK Biobank do not give a positive definite kinship matrix (due to the cut-off at 0.0442), but this may or may not be the issue for you, depending on which subset you were using. If the kinship matrix for your subset is not positive definite, I think there was really just one block of a few hundred individuals that caused the problem. You could drop these individuals to avoid the Cholesky issue. A better solution would be the UK Biobank fixes this block of individuals by filling in values that have been zeroed out (e.g., 0.0441) to make it positive definite, and release a new kinship matrix.

If you only saw this issue sometimes, and got a counter-example that successfully converged in the same subset (with a different or a simulated phenotype), then it was likely a convergence problem related to your case/control phenotypes.

Best,
Han

from staar.

Related Issues (15)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.