Hello, I am having some issues with running STAAR using a pre-comput

Hi Eugene, Thank you very much for letting me know. <p dir="auto

Hi Eugene, My apologies, the thresh<

Colesky factorization failed about staar HOT 9 CLOSED

xihaoli commented on July 23, 2024

Colesky factorization failed

from staar.

Comments (9)

xihaoli commented on July 23, 2024

Hi Eugene,
Thanks for your message. This error seems to be caused by some numerical issues when fitting the null model. One quick consideration is to use the fitNullModel() function from the GENESIS package to fit the null model (with the current GRM), and then use the genesis2staar_nullmodel() function from the STAARpipeline package to convert the GENESIS null model object to a STAAR null model object. Both STAAR and GENESIS share the same statistical null model when performing association tests.
Please feel free to let me know if this works.
Best,
Xihao

from staar.

eugenegardner commented on July 23, 2024

Hello Xihao,

Thank you for the quick reply. Unfortunately, when I implemented the null model via GENESIS as suggested:

obj_nullmodel <- fitNullModel(x = data_for_STAAR, outcome = pheno_name, covars = covariates, cov.mat = sparse_kinship, family = "gaussian", rescale = "none")

I get a very similar error:

Computing Variance Component Estimates...
Sigma^2_A     log-lik     RSS
[1]  4.958985e-01  4.958985e-01 -5.903154e+05  1.313380e+00
[1]  9.029733e-01  4.702472e-01 -5.901148e+05  1.061299e+00
Error in .local(x, ...) : 
  internal_chm_factor: Cholesky factorization failed
In addition: Warning message:
In .local(x, ...) :
  Cholmod warning 'not positive definite' at file ../Cholesky/t_cholmod_rowfac.c, line 430

Any suggestions?

I would hope this is very easy to reproduce. I simply acquired the sparse matrix (.rel) from The UK Biobank website, and converted it into a symmetric dsTMatrix from the Matrix package. I then limited it to individuals included in the covariate table (given to 'x' above).

from staar.

xihaoli commented on July 23, 2024

Hi Eugene,

Thank you very much for letting me know.

In this case, I think you could try some different thresholds of the sparse GRM by gradually increasing the threshold from 0.0442 to something slightly higher and see if the null model would fit. For this route, it would be ideal to keep all pairwise (estimated) relatedness within a cluster, even if they are below the threshold, rather than setting all entries below the threshold to 0 in the sparse GRM. To do this, the fit_null_glmmkin() function from the STAAR package has a parameter thresh to convert an input dense GRM to a sparse GRM as defined above and then fit the null model. Equivalently, you could use the makeSparseMatrix() function from the GENESIS package to directly operate on the GRM before fitting the null model.

Hope this helps!

Best,
Xihao

from staar.

eugenegardner commented on July 23, 2024

Hi Xihao,

I don't think there is a 'thresh' parameter? Did you mean kins_cutoff?

I will try the other method as well.

from staar.

eugenegardner commented on July 23, 2024

Hi Xihao,

I tried both approaches and still got the same error. Is markSparseMatrix() really the solution given that the matrix is already sparse?

I used a slightly different (external) approach to threshold and it seemed to work:

sparse_kinship@x <- sapply(sparse_kinship@x, function(x) if_else(x < 0.05, 0.05, x))

Where 0.05 is the threshold. This seems a bit like a suboptimal solution since it misrepresents the relatedness between individuals. Did I misunderstand your comment of:

For this route, it would be ideal to keep all pairwise (estimated) relatedness within a cluster

from staar.

xihaoli commented on July 23, 2024

Hi Eugene,

My apologies, the thresh parameter refers to kins_cutoff in the fit_null_glmmkin() function;
Your understanding is exactly correct. The approach that "keeps all pairwise relatedness within a cluster, even if they are below the threshold" is considered to be more optimal than the "direct thresholding" approach, since the latter misrepresents the relatedness between individuals;
The makeSparseMatrix() function implements the "optimal" approach as described above, however, it assumes a dense GRM as input. To do this, you could first convert the sparse GRM back to a dense GRM and use that as the input of either the fit_null_glmmkin() function with different values of kins_cutoff (or the makeSparseMatrix() function, with different values of thresh).

Hope this is more clear.

Best,
Xihao

from staar.

eugenegardner commented on July 23, 2024

Hello Xihao,

I've recently noticed via testing additional phenotypes that this seems very dependent on the case/control imbalance of the phenotype being studied. In a set of ~200,000 individuals, phenotypes with a prevalence of < ~0.2% seem to always run into the Cholesky issue I mentioned in my original post. Just wondering if you have any advice on how to handle this as the thresholding issue now seems suboptimal when it is not possible to know a priori what the threshold should be in the first place.

from staar.

xihaoli commented on July 23, 2024

Hi Eugene,
Thanks for letting me know. We will take a look at this and get back to you.
Best,
Xihao

from staar.

hanchenphd commented on July 23, 2024

Hello Xihao,

I've recently noticed via testing additional phenotypes that this seems very dependent on the case/control imbalance of the phenotype being studied. In a set of ~200,000 individuals, phenotypes with a prevalence of < ~0.2% seem to always run into the Cholesky issue I mentioned in my original post. Just wondering if you have any advice on how to handle this as the thresholding issue now seems suboptimal when it is not possible to know a priori what the threshold should be in the first place.

Did you always see the Cholesky issue in this subset (regardless of which phenotypes), or just sometimes? The kinship coefficients provided by the UK Biobank do not give a positive definite kinship matrix (due to the cut-off at 0.0442), but this may or may not be the issue for you, depending on which subset you were using. If the kinship matrix for your subset is not positive definite, I think there was really just one block of a few hundred individuals that caused the problem. You could drop these individuals to avoid the Cholesky issue. A better solution would be the UK Biobank fixes this block of individuals by filling in values that have been zeroed out (e.g., 0.0441) to make it positive definite, and release a new kinship matrix.

If you only saw this issue sometimes, and got a counter-example that successfully converged in the same subset (with a different or a simulated phenotype), then it was likely a convergence problem related to your case/control phenotypes.

Best,
Han

from staar.

Colesky factorization failed about staar HOT 9 CLOSED

Comments (9)

Related Issues (15)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent