Comments (9)

willtownes commented on July 30, 2024

Thanks for another interesting data experiment. I'm not able to reproduce this; could you please provide the random seed? It may just have been an unlucky initialization. I acknowledge that the current random initialization is not ideal, but none of the alternatives I have tried so far produce better average-case performance in terms of convergence time, numerical stability, etc. It may also help to look at the deviance trace as a diagnostic for failed convergence (plot(fit$dev, type="l"), where fit is the glmpca object in the metadata of the SingleCellExperiment; see the sketch below). By the way, when I ran this with L=2 it took 22 sec on my laptop (24 sec with L=20), and this is what the factors look like:

[image: plot of the GLM-PCA factors from the L=2 run]
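A minimal sketch of that deviance-trace diagnostic, assuming the GLMPCA call completes; here sce is a placeholder for the returned SingleCellExperiment, and the metadata element name "glmpca" is an assumption rather than something confirmed in this thread:

# Sketch only: plot the deviance trace of a completed scry::GLMPCA fit.
# "sce" is a placeholder for the returned SingleCellExperiment; the metadata
# element name "glmpca" is assumed -- check names(metadata(sce)) if it differs.
library(S4Vectors)                      # provides metadata()
fit <- metadata(sce)$glmpca
plot(fit$dev, type="l", xlab="iteration", ylab="deviance")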


LTLA commented on July 30, 2024

Interesting. I am consistently getting that error above. If it helps, I'll set the seed, let's say to 50:

library(BiocFileCache)
bfc <- BiocFileCache(ask=FALSE)
qcdata <- bfcrpath(bfc, "https://github.com/LuyiTian/CellBench_data/blob/master/data/mRNAmix_qc.RData?raw=true")

env <- new.env()
load(qcdata, envir=env)
sce.8qc <- env$sce8_qc
sce.8qc$mix <- factor(sce.8qc$mix)

# Library size normalization and log-transformation.
library(scuttle)
library(scran)
set.seed(50)

sce.8qc <- logNormCounts(sce.8qc)
dec.8qc <- modelGeneVar(sce.8qc)
hvgs.8qc <- getTopHVGs(dec.8qc, n=1000)

library(scry)
sce.8qc2 <- GLMPCA(sce.8qc[hvgs.8qc,], fam="nb",
    L=20, sz=sizeFactors(sce.8qc), verbose=TRUE)
## Control parameter 'penalty' should be provided as an element of 'ctl' rather than a separate argument.
## Control parameter 'verbose' should be provided as an element of 'ctl' rather than a separate argument.
## Control parameter 'eps' is deprecated. Coercing to equivalent 'tol'. Please use 'tol' in the future as 'eps' will eventually be removed
## Error: Poor model fit (final deviance higher than initial deviance). Try modifying control parameters to improve optimization.

With verbose=TRUE, it rambles on for a while:

Trying AvaGrad with learning rate: 5.12e-08
Iteration: 1 | deviance=2.01e+05 | nb_theta: 1
Iteration: 2 | deviance=2.486e+05 | nb_theta: 1.54
Iteration: 3 | deviance=2.805e+05 | nb_theta: 2
Iteration: 4 | deviance=3.018e+05 | nb_theta: 2.37
Iteration: 5 | deviance=3.163e+05 | nb_theta: 2.65
Iteration: 6 | deviance=3.262e+05 | nb_theta: 2.85
Iteration: 7 | deviance=3.33e+05 | nb_theta: 3.01
Iteration: 8 | deviance=3.378e+05 | nb_theta: 3.12
Iteration: 9 | deviance=3.411e+05 | nb_theta: 3.2
Iteration: 10 | deviance=3.435e+05 | nb_theta: 3.25
Iteration: 11 | deviance=3.451e+05 | nb_theta: 3.29
Iteration: 12 | deviance=3.463e+05 | nb_theta: 3.32
Iteration: 13 | deviance=3.471e+05 | nb_theta: 3.34
Iteration: 14 | deviance=3.477e+05 | nb_theta: 3.36
Iteration: 15 | deviance=3.481e+05 | nb_theta: 3.37
Iteration: 16 | deviance=3.484e+05 | nb_theta: 3.38
Iteration: 17 | deviance=3.486e+05 | nb_theta: 3.38
Iteration: 18 | deviance=3.487e+05 | nb_theta: 3.38
Iteration: 19 | deviance=3.488e+05 | nb_theta: 3.39
Iteration: 20 | deviance=3.489e+05 | nb_theta: 3.39
Iteration: 21 | deviance=3.49e+05 | nb_theta: 3.39
Iteration: 22 | deviance=3.49e+05 | nb_theta: 3.39
Iteration: 23 | deviance=3.49e+05 | nb_theta: 3.39
Iteration: 24 | deviance=3.49e+05 | nb_theta: 3.39
Iteration: 25 | deviance=3.491e+05 | nb_theta: 3.39
Iteration: 26 | deviance=3.491e+05 | nb_theta: 3.39
Iteration: 27 | deviance=3.491e+05 | nb_theta: 3.39
Iteration: 28 | deviance=3.491e+05 | nb_theta: 3.39
Iteration: 29 | deviance=3.491e+05 | nb_theta: 3.39
Iteration: 30 | deviance=3.491e+05 | nb_theta: 3.39
Iteration: 31 | deviance=3.491e+05 | nb_theta: 3.39
Error: Poor model fit (final deviance higher than initial deviance). Try modifying control parameters to improve optimization.

It's also kind of bemusing that the deviance is steadily increasing... I would have expected it to decrease.


willtownes commented on July 30, 2024

I haven't forgotten about this; sorry for the delay. One clarification: do you install all the packages from BioC-devel or from their respective GitHub repositories?


LTLA commented on July 30, 2024

BioC-devel. Of course, I could use the GitHub repos locally, but that just kicks the can down the road as the OSCA book will only build using packages from the official BioC-devel channel.


willtownes commented on July 30, 2024

OK, I was finally able to reproduce this bug. It occurs for a variety of random seeds; what matters is that scry is installed from the Bioconductor development branch. When I install scry from GitHub, the bug does not occur for any random seed. It also doesn't occur if I call glmpca directly from the CRAN package without the scry wrapper. For example, replace the last two lines of Aaron's original code with:

library(glmpca) # from CRAN
m <- assay(sce.8qc[hvgs.8qc, ], "counts")
fit <- glmpca(m, L=20, fam="nb", sz=sizeFactors(sce.8qc), verbose=TRUE)

What seems to be happening is that convergence occurs at a deviance of about 3.5e5 for all initializations, but the BioC development version weirdly produces an initialization with a lower deviance (2.0e5). Both the scry GitHub version and the glmpca CRAN version initialize at a deviance of about 7.3e5. Another wrinkle: with scry from GitHub and glmpca from CRAN, nb_theta is initialized to 100, whereas the BioC-devel version initializes it to 1.
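As a quick sanity check on which code path is actually in play, comparing the installed package versions is enough; the exact version strings will of course differ across BioC-devel, GitHub, and CRAN installs:

# Sanity check: confirm which scry/glmpca builds are installed.
packageVersion("scry")    # BioC-devel vs the GitHub branch
packageVersion("glmpca")  # CRAN release used by the direct glmpca() call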


willtownes commented on July 30, 2024

By the way, it is possible for the deviance to increase slightly at the end of the optimization due to the momentum component of the optimizer. Also, the deviance is only defined conditional on a particular nb_theta value, so it is possible to have a lower deviance but a worse set of latent factors. This further underscores the need to improve the estimation of nb_theta in glmpca.
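To make that last point concrete, here is a small standalone sketch (not glmpca internals) of the negative binomial deviance at a fixed theta; the same counts and fitted means give different deviance values for different theta, so deviances computed under different nb_theta values are not directly comparable:

# Standalone illustration: NB deviance conditional on a fixed theta.
nb_deviance <- function(y, mu, theta) {
  sat <- ifelse(y > 0, y * log(y / mu), 0)                 # saturated-model term; 0*log(0) -> 0
  fit <- (y + theta) * log((y + theta) / (mu + theta))
  2 * sum(sat - fit)
}
set.seed(1)
y  <- rnbinom(1000, mu = 5, size = 10)
mu <- rep(mean(y), length(y))
nb_deviance(y, mu, theta = 1)    # a different value from...
nb_deviance(y, mu, theta = 100)  # ...this, even though y and mu are identical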


willtownes commented on July 30, 2024

OK, I have now reproduced this bug in all implementations by setting nb_theta=1. @LTLA, as a quick fix, please try re-running your example with nb_theta=100 (see the sketch below). It's still a bug, because it shouldn't default to such a low nb_theta initialization.
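For concreteness, the re-run would look something like the following; this assumes the scry wrapper forwards nb_theta through to glmpca (adjust if your installed version handles it differently):

# Quick-fix sketch: initialize nb_theta at 100 instead of the old default of 1.
# Assumes GLMPCA() passes nb_theta through to glmpca().
sce.8qc2 <- GLMPCA(sce.8qc[hvgs.8qc,], fam="nb", L=20,
    sz=sizeFactors(sce.8qc), nb_theta=100)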


willtownes commented on July 30, 2024

Confirmed by checking the source code: the bioc-devel branch has an old implementation (bioc-release is actually further along). The old default is nb_theta=1; the new version defaults to nb_theta=100. Once we update bioc-devel, the problem should go away.


LTLA commented on July 30, 2024

Yes, this fixes the problem. Finishes faster too.

Bit surprised that BioC-release is further along than BioC-devel. Probably preaching to the choir here, but I'd be pretty scared to push stuff to BioC-release without it having run the BBS gauntlet in BioC-devel.

