rmcelreath / glmer2stan Goto Github PK

Define Stan models using glmer-style (lme4) formulas

R 100.00%

glmer2stan's Introduction

glmer2stan

Define Stan models using glmer-style (lme4) formulas.

Stan (mc-stan.org) is a Hamiltonian Monte Carlo engine for fitting Bayesian models to data.

glmer2stan compiles design formulas, such as y ~ (1|id) + x, into Stan model code. This allows the specification of simple multilevel models, using familiar formula syntax of the kind many people have learned from popular R packages like lme4.

The Stan code that glmer2stan produces is didactic, not efficient. So instead of using design matrices for varying and fixed effects, it builds out explicit additive expressions. This makes the code easier for novices to understand. But it also reduces speed a little. Since I originally conceived of glmer2stan as a teaching tool, the Stan code will probably remain didactic. But I might eventually make an option to use faster-but-opaque techniques.

Status

Current state: All models pass tests, DIC computations validated. Basic prediction function added, but still needs work. WAIC is in and working for single-formula models, but still needs some more thorough numerical testing, to find corner cases.

Horizon: I'm not planning on developing glmer2stan much further. The problem is that I find it unsatisfying both as a teaching tool and as a multilevel regression tool.

It is unsatisfying for teaching, because students don't tend use it as a way to learn the connections between Stan code and glmer formulas, but rather as a way to avoid learning Stan code. For those who are using glmer2stan as a teaching tool, it remains and will remain useful, as it exists now. I don't think everyone needs to learn Stan code, mind you. But I would like a tool that helps more in that regard.

It is unsatisfying as a multilevel regression tool, because there's no natural way to allow prior specifications. And the natural default priors are the same flat (or nearly flat) priors as glmer, which leads to lots of problems with Stan's HMC. Also, complex issues like specifying covariance priors are very hard to approach at all with glmer's syntax. This isn't a criticism of glmer (which remains a great tool), but rather just a comment on how far one can reasonably push its syntax.

So caught between bad defaults and no easy input syntax (other than editing Stan code directly), I'd rather give my time to another project that addresses all these issues at once. That project (map2stan) is in my other package, rethinking, and it will be described in a book I'm writing. map2stan uses templating and all the other rational things I didn't use when hacking together glmer2stan. So it'll also be able to grow as Stan grows.

I think it might be worth using some glmer2stan code to translate glmer formulas into map2stan input formulas, which do lay bare all the model assumptions. But I'm still figuring out how to proceed. My primary interests are pedagogical here, so I'm trying to tune the tools so they balance convenience and understanding.

Examples

(1) Gaussian with varying intercepts and slopes

library(lme4)
library(glmer2stan)

data(sleepstudy) # built into lme4

# fit with lme4
m1_lme4 <- lmer( Reaction ~ Days + (Days | Subject), sleepstudy, REML=FALSE )

# construct subject index --- glmer2stan forces you to manage your own cluster indices
sleepstudy$subject_index <- as.integer(as.factor(sleepstudy$Subject))

# fit with glmer2stan
m1_g2s <- lmer2stan( Reaction ~ Days + (Days | subject_index), data=sleepstudy )

# [timings for 3GHz i5, no memory limitations]
#Elapsed Time: 46.1606 seconds (Warm-up)
#              32.3429 seconds (Sampling)
#              78.5035 seconds (Total)

summary(m1_lme4)

#Linear mixed model fit by maximum likelihood ['lmerMod']
#Formula: Reaction ~ Days + (Days | Subject) 
#   Data: sleepstudy
#
#      AIC       BIC    logLik  deviance 
#1763.9393 1783.0971 -875.9697 1751.9393 
#
#Random effects:
# Groups   Name        Variance Std.Dev. Corr
# Subject  (Intercept) 565.52   23.781       
#          Days         32.68    5.717   0.08
# Residual             654.94   25.592       
#Number of obs: 180, groups: Subject, 18
#
#Fixed effects:
#            Estimate Std. Error t value
#(Intercept)  251.405      6.632   37.91
#Days          10.467      1.502    6.97
#
#Correlation of Fixed Effects:
#     (Intr)
#Days -0.138

stanmer(m1_g2s)

#glmer2stan model: Reaction ~ Days + (Days | subject_index) [gaussian]
#
#Level 1 estimates:
#            Expectation StdDev   2.5%  97.5%
#(Intercept)      249.73   7.70 234.51 265.10
#Days              10.40   1.67   7.08  13.73
#sigma             25.76   1.50  22.99  28.80
#
#Level 2 estimates:
#(Std.dev. and correlations)
#
#Group: subject_index (18 groups / imbalance: 0)
#                 
#                    (1)  (2)
#  (1) (Intercept) 32.33   NA
#  (2) Days        -0.03 7.56
#
#DIC: 1711   pDIC: 31.9   Deviance: 1647.2

# compare varying effect estimates with:
ranef(m1_lme4)
varef(m1_g2s)$expectation

# extract posterior samples
posterior <- extract(m1_g2s)
str(posterior)

# compute posterior predictions
pp <- stanpredict(m1_g2s,data=sleepstudy)
str(pp)

#List of 1
# $ Reaction:List of 3
#  ..$ mu    : num [1:180] 252 272 292 312 332 ...
#  ..$ mu.ci : num [1:2, 1:180] 225 278 249 294 273 ...
#  .. ..- attr(*, "dimnames")=List of 2
#  .. .. ..$ : chr [1:2] "2.5%" "97.5%"
#  .. .. ..$ : NULL
#  ..$ obs.ci: num [1:2, 1:180] 195 308 216 325 238 ...
#  .. ..- attr(*, "dimnames")=List of 2
#  .. .. ..$ : chr [1:2] "2.5%" "97.5%"
#  .. .. ..$ : NULL

# compare to lme4 MAP/MLE predictions
predict(m1_lme4)

# plot comparison
plot( predict(m1_lme4) , pp$Reaction$mu )

You can expose the Stan model code by pulling out m1_g2s@stanmodel:

data{
    int N;
    real Reaction[N];
    real Days[N];
    int subject_index[N];
    int N_subject_index;
}

transformed data{
    vector[2] zeros_subject_index;
    for ( i in 1:2 ) zeros_subject_index[i] <- 0;
}

parameters{
    real Intercept;
    real beta_Days;
    real<lower=0> sigma;
    vector[2] vary_subject_index[N_subject_index];
    cov_matrix[2] Sigma_subject_index;
}

model{
    real vary[N];
    real glm[N];
    // Priors
    Intercept ~ normal( 0 , 100 );
    beta_Days ~ normal( 0 , 100 );
    sigma ~ uniform( 0 , 100 );
    // Varying effects
    for ( j in 1:N_subject_index ) vary_subject_index[j] ~ multi_normal( zeros_subject_index , Sigma_subject_index );
    // Fixed effects
    for ( i in 1:N ) {
        vary[i] <- vary_subject_index[subject_index[i],1]
                + vary_subject_index[subject_index[i],2] * Days[i];
        glm[i] <- vary[i] + Intercept
                + beta_Days * Days[i];
    }
    Reaction ~ normal( glm , sigma );
}

generated quantities{
    real dev;
    real vary[N];
    real glm[N];
    dev <- 0;
    for ( i in 1:N ) {
        vary[i] <- vary_subject_index[subject_index[i],1]
                + vary_subject_index[subject_index[i],2] * Days[i];
        glm[i] <- vary[i] + Intercept
                + beta_Days * Days[i];
        dev <- dev + (-2) * normal_log( Reaction[i] , glm[i] , sigma );
    }
}

(2) Binomial with varying intercepts

data(cbpp) # built into lme4

m2_lme4 <- glmer( cbind(incidence,size-incidence) ~ period + (1|herd) , data=cbpp , family=binomial )

cbpp$herd_index <- as.integer(cbpp$herd)
m2_g2s <- glmer2stan( cbind(incidence,size-incidence) ~ period + (1|herd_index) , data=cbpp , family="binomial" )

summary(m2_lme4)
stanmer(m2_g2s)

The Stan model code, accessed by m2_g2s@stanmodel:

data{
    int N;
    int incidence[N];
    real period2[N];
    real period3[N];
    real period4[N];
    int herd_index[N];
    int bin_total[N];
    int N_herd_index;
}

parameters{
    real Intercept;
    real beta_period2;
    real beta_period3;
    real beta_period4;
    real vary_herd_index[N_herd_index];
    real<lower=0> sigma_herd_index;
}

model{
    real vary[N];
    real glm[N];
    // Priors
    Intercept ~ normal( 0 , 100 );
    beta_period2 ~ normal( 0 , 100 );
    beta_period3 ~ normal( 0 , 100 );
    beta_period4 ~ normal( 0 , 100 );
    sigma_herd_index ~ uniform( 0 , 100 );
    // Varying effects
    for ( j in 1:N_herd_index ) vary_herd_index[j] ~ normal( 0 , sigma_herd_index );
    // Fixed effects
    for ( i in 1:N ) {
        vary[i] <- vary_herd_index[herd_index[i]];
        glm[i] <- vary[i] + Intercept
                + beta_period2 * period2[i]
                + beta_period3 * period3[i]
                + beta_period4 * period4[i];
        glm[i] <- inv_logit( glm[i] );
    }
    incidence ~ binomial( bin_total , glm );
}

generated quantities{
    real dev;
    real vary[N];
    real glm[N];
    dev <- 0;
    for ( i in 1:N ) {
        vary[i] <- vary_herd_index[herd_index[i]];
        glm[i] <- vary[i] + Intercept
                + beta_period2 * period2[i]
                + beta_period3 * period3[i]
                + beta_period4 * period4[i];
        dev <- dev + (-2) * binomial_log( incidence[i] , bin_total[i] , inv_logit(glm[i]) );
    }
}

(3) Zero-augmented (zero-inflated) gamma model

This is really a two-outcome non-linear factor analysis (or Gaussian process) of a sort, using varying intercepts to relate outcomes from the same individuals. It demonstrates how to specify models with more than one formula and use varying effects to link them together. This model cannot be fit with lme4, but the formula syntax still follows lme4 conventions (mostly).

data(Ache) # built into glmer2stan

# prepare two outcome formula list
the_formula <- list(
		iszero ~ (1|hunter.id) ,
		nonzero ~ (1|hunter.id)
	)

# note the list of family names in call to glmer2stan
m3 <- glmer2stan( the_formula , data=Ache , family=list("binomial","gamma") )

stanmer(m3)

# plot varying intercepts across outcomes, showing correlation
v <- varef(m3)
plot( v$expectation$hunter_id[,1] , v$expectation$hunter_id[,2] , xlab="hunter effect (failures)" , ylab="hunter effect (harvests)" )

As before, you can view the Stan code by extracting m3@stanmodel.

glmer2stan's People

Stargazers

Watchers

Forkers

maverickg gragusa lindemann09 stanke jasongraf1 loeckete guhjy marek1914 hal2001 karthy257 biostatsreportautomation

glmer2stan's Issues

Variable name conflicts

When a variable is named (e.g.) 'N', there is a conflict with the Stan code, because 'N' is used for the length of the outcome vector.

Proposed solution: Check all variable and parameter names against a reserved list.

Could also automatically rename inputs to resolve without user interaction, but that might generate confusion when coefficient table contains unexpected names. Keeping an alias list that is saved in the fit could address that, but then the internals would still have the unexpected names.

cbind(y,n-y) in formula causes error with stanpredict

Fitting a binomial model with the cbind outcome, with an inline construction of the failure count, later causes errors in stanpredict (and possibly other functions).

Problem arises because n-y gets constructed into the model frame, but stanpredict doesn't have access to the constructed variable.

Need to either fix or detect and warn.

Disappearing Random Effects (w/ fix)

Issue: Disappearing Random effects

While I was using glmer2stan to run my dissertation I noticed that some of my more complex models woud have myseriously disappearing random effects. I've replicated the bug below and my solution. Not sure if anyone else has experienced this, but I figured its worth documenting

library(lme4)
data(sleepstudy)
sleepstudy$Days2 = rnorm(length(sleepstudy$Days)) # just noise
sleepstudy$Days3 = rnorm(length(sleepstudy$Days)) # just noise
sleepstudy$Days4 = rnorm(length(sleepstudy$Days)) # just noise


sleepstudy$subject_index <- as.integer(as.factor(sleepstudy$Subject)) 

# basic MCMC parameters
niter = 500
nwarm = 100
chains = 4

#glmer2stan
example1 <- lmer2stan( Reaction ~ Days+Days2+Days3+Days4
                                 +Days:Days2
                                 +Days:Days2
                                 +Days:Days2:Days3:Days4
                       + (1 | subject_index), data=sleepstudy,
                     calcWAIC=T,
                     warmup=nwarm, #burn-in period
                     iter = niter, # number of steps per chain
                     chains=chains,sample=F) #number of chains
cat(example1$model)

The above code seems to shit the bed when there's a mix of high and low level interactions specified
with ':' instead of fully with '*'. Specifically when this mix is present, the model does not
contain any random effects (but will run just fine which is scary).

data{
    int N;
    real Reaction[N];
    real Days[N];
    real Days2[N];
    real Days3[N];
    real Days4[N];
    real subject_index[N]; //okay the index is obviously in the data...
    real Days_X_Days2[N];
    real Days_X_Days2_X_Days3_X_Days4[N];
}

parameters{
    real Intercept;
    real beta_Days;
    real beta_Days2;
    real beta_Days3;
    real beta_Days4;
    real beta_Days_X_Days2;
    real beta_Days_X_Days2_X_Days3_X_Days4;
    real<lower=0> sigma;
}

model{
    real vary[N];
    real glm[N];
    // Priors
    Intercept ~ normal( 0 , 100 );
    beta_Days ~ normal( 0 , 100 );
    beta_Days2 ~ normal( 0 , 100 );
    beta_Days3 ~ normal( 0 , 100 );
    beta_Days4 ~ normal( 0 , 100 );
    beta_Days_X_Days2 ~ normal( 0 , 100 );
    beta_Days_X_Days2_X_Days3_X_Days4 ~ normal( 0 , 100 );
    sigma ~ uniform( 0 , 100 );
    // WTF where are my random effects??

    // Fixed effects
    for ( i in 1:N ) {
        glm[i] <- Intercept
                + beta_Days * Days[i]
                + beta_Days2 * Days2[i]
                + beta_Days3 * Days3[i]
                + beta_Days4 * Days4[i]
                + beta_Days_X_Days2 * Days_X_Days2[i]
                + beta_Days_X_Days2_X_Days3_X_Days4 * Days_X_Days2_X_Days3_X_Days4[i];
    }
    Reaction ~ normal( glm , sigma );
}

generated quantities{
    real dev;
    real vary[N];
    real glm[N];
    dev <- 0;
    for ( i in 1:N ) {
        glm[i] <- Intercept
                + beta_Days * Days[i]
                + beta_Days2 * Days2[i]
                + beta_Days3 * Days3[i]
                + beta_Days4 * Days4[i]
                + beta_Days_X_Days2 * Days_X_Days2[i]
                + beta_Days_X_Days2_X_Days3_X_Days4 * Days_X_Days2_X_Days3_X_Days4[i];
        dev <- dev + (-2) * normal_log( Reaction[i] , glm[i] , sigma );
    }
}

The source of the issue is the parse_formula function. basically, lines 66 - 69

if (formula == nobars(formula)) {
     ranef <- list()
   }

kill the random effects for some reason. I understand they are there in case the user is actually running a model with no random effects, e.g. a bivariate regression or something simple, but for some reason the inclusion of a mix of ':' throws things off.

In order to run my analyses, which had a mix of high and low-level interactions I tweaked your code slightly and wrote myglmer2stan. I included a function arguement 'Ranef' which specifies explicitly wehter or not there are random effects present. If there are, parse formula is declared without lines 66-69. If there are no randome effects than the existing parse formula function is specified. my function can be found here.

example2 <- myglmer2stan( Reaction ~ Days+Days2+Days3+Days4
                       +Days:Days2
                       +Days:Days2
                       +Days:Days2:Days3:Days4
                       + (1 | subject_index), data=sleepstudy,
                       calcWAIC=T,
                       warmup=nwarm, #burn-in period
                       iter = niter, # number of steps per chain
                       chains=chains,sample=F) #number of chains
cat(example2$model)

no more issue!

data{
    int N;
    real Reaction[N];
    real Days[N];
    real Days2[N];
    real Days3[N];
    real Days4[N];
    int subject_index[N];
    real Days_X_Days2[N];
    real Days_X_Days2_X_Days3_X_Days4[N];
    int N_subject_index;
}

parameters{
    real Intercept;
    real beta_Days;
    real beta_Days2;
    real beta_Days3;
    real beta_Days4;
    real beta_Days_X_Days2;
    real beta_Days_X_Days2_X_Days3_X_Days4;
    real<lower=0> sigma;
    real vary_subject_index[N_subject_index];
    real<lower=0> sigma_subject_index;
}

model{
    real vary[N];
    real glm[N];
    // Priors
    Intercept ~ normal( 0 , 100 );
    beta_Days ~ normal( 0 , 100 );
    beta_Days2 ~ normal( 0 , 100 );
    beta_Days3 ~ normal( 0 , 100 );
    beta_Days4 ~ normal( 0 , 100 );
    beta_Days_X_Days2 ~ normal( 0 , 100 );
    beta_Days_X_Days2_X_Days3_X_Days4 ~ normal( 0 , 100 );
    sigma_subject_index ~ uniform( 0 , 100 );
    sigma ~ uniform( 0 , 100 );
    // Varying effects THEYRE STILL HERE!!
    for ( j in 1:N_subject_index ) vary_subject_index[j] ~ normal( 0 , sigma_subject_index );
    // Fixed effects
    for ( i in 1:N ) {
        vary[i] <- vary_subject_index[subject_index[i]];
        glm[i] <- vary[i] + Intercept
                + beta_Days * Days[i]
                + beta_Days2 * Days2[i]
                + beta_Days3 * Days3[i]
                + beta_Days4 * Days4[i]
                + beta_Days_X_Days2 * Days_X_Days2[i]
                + beta_Days_X_Days2_X_Days3_X_Days4 * Days_X_Days2_X_Days3_X_Days4[i];
    }
    Reaction ~ normal( glm , sigma );
}

generated quantities{
    real dev;
    real vary[N];
    real glm[N];
    dev <- 0;
    for ( i in 1:N ) {
        vary[i] <- vary_subject_index[subject_index[i]];
        glm[i] <- vary[i] + Intercept
                + beta_Days * Days[i]
                + beta_Days2 * Days2[i]
                + beta_Days3 * Days3[i]
                + beta_Days4 * Days4[i]
                + beta_Days_X_Days2 * Days_X_Days2[i]
                + beta_Days_X_Days2_X_Days3_X_Days4 * Days_X_Days2_X_Days3_X_Days4[i];
        dev <- dev + (-2) * normal_log( Reaction[i] , glm[i] , sigma );
    }
}

Inability to parse within formula variable transformations

First of all, thanks for this tremendous work. This is a huge step forward in increasing the accessibility of building Bayesian mixed-effect models efficiently and effectively.

I'm not sure that it is necessary to support the feature of within formula variable transformation, but it should be documented that inline variable transformations do not always compile as expected. Here is a simple example:

library(lme4)
library(rstan)
library(glmer2stan)
data(sleepstudy) # built into lme4

# fit with lme4
m1_lme4 <- lmer( Reaction ~ Days + (Days | Subject), sleepstudy, REML=FALSE )

# construct subject index --- glmer2stan forces you to manage your own cluster indices
sleepstudy$subject_index <- as.integer(as.factor(sleepstudy$Subject))

# fit with glmer2stan
m1_g2s <- lmer2stan( Reaction ~ Days + (Days | subject_index), data=sleepstudy )

# add squared terms

m2_lme4 <- lmer( Reaction ~ Days + I(Days^2) + (Days | Subject), sleepstudy, REML=FALSE )

m2_g2s <- lmer2stan( Reaction ~ Days + I(Days^2) + (Days | subject_index), data=sleepstudy )

In this case, m2_g2s fails to compile:

Starting Stan model

TRANSLATING MODEL 'Reaction ~ Days + I(Days^2) + (Days | subject_index) [gaussian]' FROM Stan CODE TO C++ CODE NOW.
Error in stanc(file = file, model_code = model_code, model_name = model_name,  : 
  failed to parse Stan model 'Reaction ~ Days + I(Days^2) + (Days | subject_index) [gaussian]' with error message:
EXPECTATION FAILURE LOCATION: file=input; line=5, column=15

    real IDays^2[N];
              ^-- here


DIAGNOSTIC(S) FROM PARSER:
Parser expecting: ";"

And here is the sessionInfo(), though I have also tested on a 64bit SLES platform and found the same results:

R version 3.0.2 (2013-09-25)
Platform: i386-w64-mingw32/i386 (32-bit)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rstan_2.0.1      glmer2stan_0.994 inline_0.3.13    Rcpp_0.10.6      devtools_1.4.1  
[6] lme4_1.0-5       Matrix_1.1-0     lattice_0.20-24 

loaded via a namespace (and not attached):
 [1] codetools_0.2-8 digest_0.6.4    evaluate_0.5.1  grid_3.0.2      httr_0.2       
 [6] MASS_7.3-29     memoise_0.1     minqa_1.2.1     nlme_3.1-113    parallel_3.0.2 
[11] RCurl_1.95-4.1  splines_3.0.2   stats4_3.0.2    stringr_0.6.2   tools_3.0.2    
[16] whisker_0.3-2

I don't think this is a bug -- just a fact that should be documented at some point.

Enhancement: Feed model code directly into glmer2stan

I noticed it was kind of difficult to re-feed an edited STAN model into the glmer2stan function. So I also included an argument, 'mymodel', which allows the user to specify an R object of text as the STAN model. The idea being you run the the model with sampling turned off first, make your changes save that text to an object and rerun with tweaked priors. my function can be found here.

example3_noSamples <- lmer2stan( Reaction ~ Days + (Days | subject_index), data=sleepstudy,
                               calcWAIC=T,
                               warmup=100, 
                               iter = 500, 
                               chains=4,
                               sample=F) 

cat(example3_noSamples$model) # stan model as text

example3_priors = "data{
  int N;
  real Reaction[N];
real Days[N];
int subject_index[N];
int N_subject_index;
}

transformed data{
vector[2] zeros_subject_index;
for ( i in 1:2 ) zeros_subject_index[i] <- 0;
}

parameters{
real Intercept;
real beta_Days;
real<lower=0> sigma;
vector[2] vary_subject_index[N_subject_index];
cov_matrix[2] Sigma_subject_index;
}

model{
real vary[N];
real glm[N];
// Priors
Intercept ~ normal( 0 , 100 );
beta_Days ~ normal( 10 , 2 ); // Hey! Look! An informed prior!
sigma ~ uniform( 0 , 100 );
// Varying effects
for ( j in 1:N_subject_index ) vary_subject_index[j] ~ multi_normal( zeros_subject_index , 
Sigma_subject_index );
// Fixed effects
for ( i in 1:N ) {
vary[i] <- vary_subject_index[subject_index[i],1]
+ vary_subject_index[subject_index[i],2] * Days[i];
glm[i] <- vary[i] + Intercept
+ beta_Days * Days[i];
}
Reaction ~ normal( glm , sigma );
}

generated quantities{
real dev;
real vary[N];
real glm[N];
dev <- 0;
for ( i in 1:N ) {
vary[i] <- vary_subject_index[subject_index[i],1]
+ vary_subject_index[subject_index[i],2] * Days[i];
glm[i] <- vary[i] + Intercept
+ beta_Days * Days[i];
dev <- dev + (-2) * normal_log( Reaction[i] , glm[i] , sigma );
}
}"


example3_withpriors <- myglmer2stan( Reaction ~ Days + (Days | subject_index), data=sleepstudy,
                        calcWAIC=T,
                        warmup=nwarm, 
                        iter = niter,
                        chains=chains, 
                        mymodel=example3_priors # include object with edited model text
) 

stanmer(example3_withpriors)

glmer2stan model: Reaction ~ Days + (Days | subject_index) [gaussian]

Level 1 estimates:
            Expectation StdDev   2.5%  97.5%
(Intercept)      249.98   7.97 233.28 264.58
Days              10.25   1.34   7.67  13.04
sigma             25.81   1.48  22.97  28.73

Level 2 estimates:
(Std.dev. and correlations)

Group: subject_index (18 groups / imbalance: 0)

                    (1)  (2)
  (1) (Intercept) 32.31   NA
  (2) Days         0.00 7.33

DIC: 1712   pDIC: 32   Deviance: 1648.2

WAIC: 1720   pWAIC: 33.2   -2*lppd: 1653.2

rstan/tidyr extract() namespace conflict

Fail to run example at https://github.com/rmcelreath/glmer2stan#2-binomial-with-varying-intercepts

library(lme4)
devtools::install_github("rmcelreath/glmer2stan")
library(glmer2stan)

Input:

data(cbpp) # built into lme4

m2_lme4 <- glmer( cbind(incidence,size-incidence) ~ period + (1|herd) , data=cbpp , family=binomial )

cbpp$herd_index <- as.integer(cbpp$herd)
m2_g2s <- glmer2stan( cbind(incidence,size-incidence) ~ period + (1|herd_index) , data=cbpp , family="binomial" )

Error:

 Error in UseMethod("extract_") :    no applicable method for 'extract_' applied to an object of class "stanfit"

According to this issue: stan-dev/rstan#118, it seems rstan/tidyr extract() namespace conflict.

SessionInfo:

R version 3.4.3 (2017-11-30)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] glmer2stan_0.995    lme4_1.1-17         bindrcpp_0.2        MDPtoolbox_4.0.3    linprog_0.9-2       lpSolve_5.6.13     
 [7] Matrix_1.2-12       promises_1.0.1      sparklyr_0.7.0-9026 forcats_0.2.0       stringr_1.3.0       dplyr_0.7.4        
[13] purrr_0.2.4         readr_1.1.1         tidyr_0.8.0         tibble_1.4.2        tidyverse_1.2.1     rethinking_1.59    
[19] rstan_2.17.3        StanHeaders_2.17.2  ggplot2_2.2.1       RMySQL_0.10.13      DBI_0.8            

loaded via a namespace (and not attached):
  [1] minqa_1.2.4                 colorspace_1.3-2            smoof_1.5.1                 rsconnect_0.8.7            
  [5] rprojroot_1.3-2             base64enc_0.1-3             rstudioapi_0.7              listenv_0.7.0              
  [9] ParamHelpers_1.10           hash_2.2.6                  mvtnorm_1.0-7               lubridate_1.7.1            
 [13] xml2_1.2.0                  splines_3.4.3               codetools_0.2-15            mnormt_1.5-5               
 [17] knitr_1.20                  mco_1.0-15.1                zeallot_0.1.0               jsonlite_1.5               
 [21] nloptr_1.0.4                broom_0.4.4                 dbplyr_1.2.1                mlrMBO_1.1.1               
 [25] tfruns_1.3                  shiny_1.0.5                 compiler_3.4.3              httr_1.3.1                 
 [29] backports_1.1.2             assertthat_0.2.0            lazyeval_0.2.1              cli_1.0.0                  
 [33] later_0.7.1                 addinpromises_0.1.0         htmltools_0.3.6             tools_3.4.3                
 [37] misc3d_0.8-4                igraph_1.2.1                coda_0.19-1                 gtable_0.2.0               
 [41] glue_1.2.0                  reshape2_1.4.3              parallelMap_1.3             Rcpp_0.12.17               
 [45] cellranger_1.1.0            RJSONIO_1.3-0               nlme_3.1-131                psych_1.8.3.3              
 [49] mlr_2.12                    globals_0.11.0              rvest_0.3.2                 mime_0.5                   
 [53] devtools_1.13.4             future_1.8.1                MASS_7.3-47                 scales_0.5.0               
 [57] hms_0.4.0                   plot3D_1.1.1                inline_0.3.15               RColorBrewer_1.1-2         
 [61] BBmisc_1.11                 geohash_0.3.0               yaml_2.1.19                 curl_3.1                   
 [65] memoise_1.1.0               reticulate_1.7.1            gridExtra_2.3               loo_2.0.0                  
 [69] keras_2.1.6.9001            stringi_1.1.6               highr_0.6                   tensorflow_1.5.0.9001      
 [73] checkmate_1.8.5             lhs_0.16                    rlang_0.2.0                 pkgconfig_2.0.1            
 [77] matrixStats_0.53.1          evaluate_0.10.1             lattice_0.20-35             bindr_0.1                  
 [81] htmlwidgets_1.0             tidyselect_0.2.4            plyr_1.8.4                  magrittr_1.5               
 [85] R6_2.2.2                    pillar_1.1.0                haven_1.1.0                 whisker_0.3-2              
 [89] foreign_0.8-69              withr_2.1.2                 survival_2.41-3             tibbletime_0.1.0.9001      
 [93] modelr_0.1.1                crayon_1.3.4                utf8_1.1.3                  plotly_4.7.1               
 [97] rmarkdown_1.9               grid_3.4.3                  readxl_1.0.0                data.table_1.10.4-3        
[101] git2r_0.20.0                ReinforcementLearning_1.0.2 digest_0.6.15               xtable_1.8-2               
[105] httpuv_1.3.5                openssl_1.0.1               pool_0.1.4                  stats4_3.4.3               
[109] munsell_0.4.3               viridisLite_0.3.0

sflist2stanfit and glmer2stan not working perfectly together

Combining fits seems to work, but some metadata in lost in process, so functions like stanmer() don't work right.

missing a fixed effect level in model without intercept

This should work, using the VerbAgg dataset in lme4, but the Stan model doesn't compile:

library(glmer2stan)
library(lme4)
# glmer2stan doesn't like factor grouping variables
VerbAgg$id <- as.integer(VerbAgg$id)
lltm <- glmer2stan(r2 ~ -1 + btype + mode + situ + (1 | id), data=VerbAgg,
  family="binomial", calcDIC=FALSE, sample=FALSE)

The problem is that with the intercept removed (-1), all 3 levels of btype should be included in the data block, but glmer2stan only includes 2 levels. (It correctly uses all 3 levels in the parameters and model blocks.)