Experiences with the book, round 2 about scry HOT 9 OPEN

LTLA commented on July 30, 2024

from scry.

Comments (9)

willtownes commented on July 30, 2024

Thanks for another interesting data experiment. I'm not able to reproduce this. Could you please provide the random seed? It may be that it was just an unlucky initialization. I acknowledge the current random initialization is not ideal, but none of the alternative initializations I have tried so far gives better average-case performance in terms of convergence time, numerical stability, etc. It may also be helpful to look at the deviance trace plot as a diagnostic for failed convergence (plot(fit$dev,type="l")), where fit is the glmpca object in the metadata of the SingleCellExperiment. By the way, when I ran this with L=2 it took 22 sec on my laptop (it took 24 sec with L=20) and this is what the factors look like:

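The deviance trace diagnostic mentioned above could be wrapped roughly as follows. This is only a sketch: plot_deviance_trace is a hypothetical helper, the toy fit is synthetic, and the location metadata(sce)$glmpca for the real fit object is an assumption about how scry stores it.

```r
# Hypothetical helper: plot the deviance trace of a glmpca-style fit.
# `fit` only needs a numeric `dev` element (one deviance per iteration).
plot_deviance_trace <- function(fit) {
  stopifnot(is.numeric(fit$dev), length(fit$dev) >= 1)
  plot(fit$dev, type = "l", xlab = "Iteration", ylab = "Deviance")
  # Invisibly report whether the final deviance is no higher than the first.
  invisible(fit$dev[length(fit$dev)] <= fit$dev[1])
}

# Synthetic stand-in for a fit (values invented for illustration).
# With scry, the real object is assumed to sit in metadata(sce)$glmpca.
toy_fit <- list(dev = c(7.3e5, 5.1e5, 4.0e5, 3.6e5, 3.5e5))
ok <- plot_deviance_trace(toy_fit)
```

A trace that flattens out below its starting value suggests convergence; one that ends above its starting value matches the "Poor model fit" condition.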
LTLA commented on July 30, 2024

Interesting. I am consistently getting that error above. If it helps, I'll set the seed, let's say to 50:

library(BiocFileCache)
bfc <- BiocFileCache(ask=FALSE)
qcdata <- bfcrpath(bfc, "https://github.com/LuyiTian/CellBench_data/blob/master/data/mRNAmix_qc.RData?raw=true")

env <- new.env()
load(qcdata, envir=env)
sce.8qc <- env$sce8_qc
sce.8qc$mix <- factor(sce.8qc$mix)

# Library size normalization and log-transformation.
library(scuttle)
library(scran)
set.seed(50)

sce.8qc <- logNormCounts(sce.8qc)
dec.8qc <- modelGeneVar(sce.8qc)
hvgs.8qc <- getTopHVGs(dec.8qc, n=1000)

library(scry)
sce.8qc2 <- GLMPCA(sce.8qc[hvgs.8qc,], fam="nb",
    L=20, sz=sizeFactors(sce.8qc), verbose=TRUE)
## Control parameter 'penalty' should be provided as an element of 'ctl' rather than a separate argument.
## Control parameter 'verbose' should be provided as an element of 'ctl' rather than a separate argument.
## Control parameter 'eps' is deprecated. Coercing to equivalent 'tol'. Please use 'tol' in the future as 'eps' will eventually be removed
## Error: Poor model fit (final deviance higher than initial deviance). Try modifying control parameters to improve optimization.

With verbose=TRUE, it rambles on for a while:

Trying AvaGrad with learning rate: 5.12e-08
Iteration: 1 | deviance=2.01e+05 | nb_theta: 1
Iteration: 2 | deviance=2.486e+05 | nb_theta: 1.54
Iteration: 3 | deviance=2.805e+05 | nb_theta: 2
Iteration: 4 | deviance=3.018e+05 | nb_theta: 2.37
Iteration: 5 | deviance=3.163e+05 | nb_theta: 2.65
Iteration: 6 | deviance=3.262e+05 | nb_theta: 2.85
Iteration: 7 | deviance=3.33e+05 | nb_theta: 3.01
Iteration: 8 | deviance=3.378e+05 | nb_theta: 3.12
Iteration: 9 | deviance=3.411e+05 | nb_theta: 3.2
Iteration: 10 | deviance=3.435e+05 | nb_theta: 3.25
Iteration: 11 | deviance=3.451e+05 | nb_theta: 3.29
Iteration: 12 | deviance=3.463e+05 | nb_theta: 3.32
Iteration: 13 | deviance=3.471e+05 | nb_theta: 3.34
Iteration: 14 | deviance=3.477e+05 | nb_theta: 3.36
Iteration: 15 | deviance=3.481e+05 | nb_theta: 3.37
Iteration: 16 | deviance=3.484e+05 | nb_theta: 3.38
Iteration: 17 | deviance=3.486e+05 | nb_theta: 3.38
Iteration: 18 | deviance=3.487e+05 | nb_theta: 3.38
Iteration: 19 | deviance=3.488e+05 | nb_theta: 3.39
Iteration: 20 | deviance=3.489e+05 | nb_theta: 3.39
Iteration: 21 | deviance=3.49e+05 | nb_theta: 3.39
Iteration: 22 | deviance=3.49e+05 | nb_theta: 3.39
Iteration: 23 | deviance=3.49e+05 | nb_theta: 3.39
Iteration: 24 | deviance=3.49e+05 | nb_theta: 3.39
Iteration: 25 | deviance=3.491e+05 | nb_theta: 3.39
Iteration: 26 | deviance=3.491e+05 | nb_theta: 3.39
Iteration: 27 | deviance=3.491e+05 | nb_theta: 3.39
Iteration: 28 | deviance=3.491e+05 | nb_theta: 3.39
Iteration: 29 | deviance=3.491e+05 | nb_theta: 3.39
Iteration: 30 | deviance=3.491e+05 | nb_theta: 3.39
Iteration: 31 | deviance=3.491e+05 | nb_theta: 3.39
Error: Poor model fit (final deviance higher than initial deviance). Try modifying control parameters to improve optimization.

It's also kind of bemusing that the deviance is steadily increasing... I would have expected it to decrease.
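A quick way to confirm the trend from the verbose log itself is to extract the deviance values and test them. This is a minimal sketch in base R on four of the logged lines; the regular expression assumes the line format shown above.

```r
# A few lines copied from the verbose output above.
log_lines <- c(
  "Iteration: 1 | deviance=2.01e+05 | nb_theta: 1",
  "Iteration: 2 | deviance=2.486e+05 | nb_theta: 1.54",
  "Iteration: 3 | deviance=2.805e+05 | nb_theta: 2",
  "Iteration: 31 | deviance=3.491e+05 | nb_theta: 3.39"
)

# Pull out the number that follows "deviance=" on each line.
dev <- as.numeric(sub(".*deviance=([0-9.eE+-]+).*", "\\1", log_lines))

# TRUE if the deviance never goes down between consecutive logged values.
never_decreases <- all(diff(dev) >= 0)

# TRUE if the run ends above where it started (the condition in the error).
final_above_initial <- dev[length(dev)] > dev[1]
```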

willtownes commented on July 30, 2024

I haven't forgotten about this, sorry for the delay. One clarification: do you install all the packages from Bioc development or from their respective GitHub repositories?

LTLA commented on July 30, 2024

BioC-devel. Of course, I could use the GitHub repos locally, but that just kicks the can down the road as the OSCA book will only build using packages from the official BioC-devel channel.

willtownes commented on July 30, 2024

OK, I was finally able to reproduce this bug. It occurs for a variety of random seeds; what matters is that scry is installed from the Bioconductor development branch. When I installed scry from GitHub, the bug did not occur for any random seed. It also doesn't occur if I call glmpca directly from the CRAN package without the scry wrapper. For example, replace the last two lines in Aaron's original code with

library(glmpca) # from CRAN
m<-assay(sce.8qc[hvgs.8qc, ], "counts")
fit<-glmpca(m, L=20, fam="nb", sz=sizeFactors(sce.8qc), verbose=TRUE)

What seems to be happening is that the optimizer converges at a deviance of about 3.5e5 for all initializations, but the Bioc development version weirdly produces an initialization with a lower deviance (2.0e5). Both the scry GitHub version and the glmpca CRAN version initialize with a deviance of about 7.3e5. Another wrinkle is that with scry GitHub and glmpca CRAN, nb_theta is initialized to 100, whereas the Bioc development version initializes it to 1.

willtownes commented on July 30, 2024

By the way, it is possible for the deviance to increase slightly at the end of the optimization due to the momentum aspect of the optimizer. Also, the deviance is only defined conditional on a particular nb_theta value, so it's possible to have a lower deviance but a worse set of latent factors. This further underscores the need to improve the estimation of nb_theta in glmpca.
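To illustrate why deviances at different nb_theta values are not directly comparable, here is a small sketch using the textbook negative binomial deviance for a fixed dispersion theta. It is assumed, not confirmed, that glmpca's internal deviance has this form; nb_deviance and the toy data are hypothetical.

```r
# Textbook NB deviance for fixed dispersion theta:
#   2 * sum( y*log(y/mu) - (y + theta)*log((y + theta)/(mu + theta)) )
nb_deviance <- function(y, mu, theta) {
  # y * log(y / mu) is taken to be 0 when y == 0.
  term1 <- ifelse(y > 0, y * log(y / mu), 0)
  term2 <- (y + theta) * log((y + theta) / (mu + theta))
  2 * sum(term1 - term2)
}

# Toy counts with the same fitted means in both cases.
y  <- c(0, 1, 5, 10)
mu <- rep(4, 4)

d_low  <- nb_deviance(y, mu, theta = 1)    # strong overdispersion allowed
d_high <- nb_deviance(y, mu, theta = 100)  # close to Poisson
```

For these toy values, d_low is smaller than d_high even though the fitted means are identical, so a lower deviance at a smaller nb_theta does not by itself indicate better latent factors.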

willtownes commented on July 30, 2024

OK, I have now reproduced this bug in all implementations by setting nb_theta=1. @LTLA, as a quick fix, please try re-running your example with nb_theta=100. It's still a bug, because it shouldn't be defaulting to such a low nb_theta initialization.
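The suggested quick fix might look something like the sketch below. rerun_glmpca_nb is a hypothetical wrapper; it assumes scry is installed and that GLMPCA forwards nb_theta and ctl on to glmpca, as it does for sz in the original call. Putting verbose inside ctl follows the hint in the warning messages shown earlier.

```r
# Hypothetical wrapper around the original call, with nb_theta set explicitly.
# Nothing from scry is loaded until the function is actually called.
rerun_glmpca_nb <- function(sce, hvgs, sz, nb_theta = 100, L = 20) {
  scry::GLMPCA(sce[hvgs, ], fam = "nb", L = L, sz = sz,
               nb_theta = nb_theta, ctl = list(verbose = TRUE))
}

# Usage with the objects from the earlier code (not run here):
# sce.8qc2 <- rerun_glmpca_nb(sce.8qc, hvgs.8qc, sizeFactors(sce.8qc))
```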

willtownes commented on July 30, 2024

I confirmed by checking the source code that the bioc-devel branch has an old implementation (bioc-release is actually further along). The old version defaults to nb_theta=1; the new version defaults to nb_theta=100. Once we update bioc-devel, the problem should go away.

LTLA commented on July 30, 2024

Yes, this fixes the problem. Finishes faster too.

Bit surprised that BioC-release is further along than BioC-devel. Probably preaching to the choir here, but I'd be pretty scared to push stuff to BioC-release without it having run the BBS gauntlet in BioC-devel.
