Comments (9)
Hi Eugene,
Thanks for your message. This error seems to be caused by some numerical issues when fitting the null model. One quick consideration is to use the fitNullModel() function from the GENESIS package to fit the null model (with the current GRM), and then use the genesis2staar_nullmodel() function from the STAARpipeline package to convert the GENESIS null model object to a STAAR null model object. Both STAAR and GENESIS share the same statistical null model when performing association tests.
Please feel free to let me know if this works.
Best,
Xihao
from staar.
Hello Xihao,
Thank you for the quick reply. Unfortunately, when I implemented the null model via GENESIS as suggested:
obj_nullmodel <- fitNullModel(x = data_for_STAAR, outcome = pheno_name, covars = covariates, cov.mat = sparse_kinship, family = "gaussian", rescale = "none")
I get a very similar error:
Computing Variance Component Estimates...
Sigma^2_A log-lik RSS
[1] 4.958985e-01 4.958985e-01 -5.903154e+05 1.313380e+00
[1] 9.029733e-01 4.702472e-01 -5.901148e+05 1.061299e+00
Error in .local(x, ...) :
internal_chm_factor: Cholesky factorization failed
In addition: Warning message:
In .local(x, ...) :
Cholmod warning 'not positive definite' at file ../Cholesky/t_cholmod_rowfac.c, line 430
Any suggestions?
I would hope this is very easy to reproduce. I simply acquired the sparse matrix (.rel) from The UK Biobank website, and converted it into a symmetric dsTMatrix from the Matrix package. I then limited it to individuals included in the covariate table (given to 'x' above).
from staar.
Hi Eugene,
Thank you very much for letting me know.
In this case, I think you could try some different thresholds of the sparse GRM by gradually increasing the threshold from 0.0442 to something slightly higher and see if the null model would fit. For this route, it would be ideal to keep all pairwise (estimated) relatedness within a cluster, even if they are below the threshold, rather than setting all entries below the threshold to 0 in the sparse GRM. To do this, the fit_null_glmmkin()
function from the STAAR package has a parameter thresh
to convert an input dense GRM to a sparse GRM as defined above and then fit the null model. Equivalently, you could use the makeSparseMatrix()
function from the GENESIS package to directly operate on the GRM before fitting the null model.
Hope this helps!
Best,
Xihao
from staar.
Hi Xihao,
I don't think there is a 'thresh' parameter? Did you mean kins_cutoff?
I will try the other method as well.
from staar.
Hi Xihao,
I tried both approaches and still got the same error. Is markSparseMatrix() really the solution given that the matrix is already sparse?
I used a slightly different (external) approach to threshold and it seemed to work:
sparse_kinship@x <- sapply(sparse_kinship@x, function(x) if_else(x < 0.05, 0.05, x))
Where 0.05 is the threshold. This seems a bit like a suboptimal solution since it misrepresents the relatedness between individuals. Did I misunderstand your comment of:
For this route, it would be ideal to keep all pairwise (estimated) relatedness within a cluster
?
from staar.
Hi Eugene,
- My apologies, the
thresh
parameter refers tokins_cutoff
in thefit_null_glmmkin()
function; - Your understanding is exactly correct. The approach that "keeps all pairwise relatedness within a cluster, even if they are below the threshold" is considered to be more optimal than the "direct thresholding" approach, since the latter misrepresents the relatedness between individuals;
- The
makeSparseMatrix()
function implements the "optimal" approach as described above, however, it assumes a dense GRM as input. To do this, you could first convert the sparse GRM back to a dense GRM and use that as the input of either thefit_null_glmmkin()
function with different values ofkins_cutoff
(or themakeSparseMatrix()
function, with different values ofthresh
).
Hope this is more clear.
Best,
Xihao
from staar.
Hello Xihao,
I've recently noticed via testing additional phenotypes that this seems very dependent on the case/control imbalance of the phenotype being studied. In a set of ~200,000 individuals, phenotypes with a prevalence of < ~0.2% seem to always run into the Cholesky issue I mentioned in my original post. Just wondering if you have any advice on how to handle this as the thresholding issue now seems suboptimal when it is not possible to know a priori what the threshold should be in the first place.
from staar.
Hi Eugene,
Thanks for letting me know. We will take a look at this and get back to you.
Best,
Xihao
from staar.
Hello Xihao,
I've recently noticed via testing additional phenotypes that this seems very dependent on the case/control imbalance of the phenotype being studied. In a set of ~200,000 individuals, phenotypes with a prevalence of < ~0.2% seem to always run into the Cholesky issue I mentioned in my original post. Just wondering if you have any advice on how to handle this as the thresholding issue now seems suboptimal when it is not possible to know a priori what the threshold should be in the first place.
Did you always see the Cholesky issue in this subset (regardless of which phenotypes), or just sometimes? The kinship coefficients provided by the UK Biobank do not give a positive definite kinship matrix (due to the cut-off at 0.0442), but this may or may not be the issue for you, depending on which subset you were using. If the kinship matrix for your subset is not positive definite, I think there was really just one block of a few hundred individuals that caused the problem. You could drop these individuals to avoid the Cholesky issue. A better solution would be the UK Biobank fixes this block of individuals by filling in values that have been zeroed out (e.g., 0.0441) to make it positive definite, and release a new kinship matrix.
If you only saw this issue sometimes, and got a counter-example that successfully converged in the same subset (with a different or a simulated phenotype), then it was likely a convergence problem related to your case/control phenotypes.
Best,
Han
from staar.
Related Issues (15)
- error: caught segfault
- Handling missing annotations HOT 2
- Ranks Used For Generating Phred Scores For Functional Annotations HOT 2
- Using Categorical Annotations with Multiple Categories in STAAR. HOT 2
- Calculation of Phred Scores for Functional/Deleterious Annotations HOT 4
- Question About Assessing Relative Importance of Tests and Annotations HOT 1
- How to generate the genotype file more quickly? HOT 3
- ERROR: compilation failed for package βSTAARβ HOT 3
- Script for calculation of annotation PCs (aPCs) and source file for annotations HOT 1
- decomposition failed error HOT 5
- docker image of STAAR HOT 1
- How to simply use vcf format file as input HOT 3
- Installation failed HOT 4
- Compilation error HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
π Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. πππ
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google β€οΈ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from staar.