Comments (9)

tyarkoni commented on May 30, 2024

Thanks for the pointer, this notebook is great!

We haven't done any comparisons ourselves, since we're not doing any modeling outside of PyMC3. One of the hopes of this project is that other people will help contribute other backends, so that the same model could be fit in multiple packages using the same interface. That would then make it very easy to add comparison tools that address precisely this kind of issue.

jake-westfall commented on May 30, 2024

I've done a handful of comparisons and verified that bambi w/ default priors seemed to agree well with lme4 in those cases, but I haven't done anything systematic. One thought is that, unlike for models without random effects, where we would expect very good agreement between the Bayesian and OLS regressions, it's not clear in general how much agreement we should expect once we add random effects. The reason (I think) is that unless you have large numbers of clusters for all your random factors, the random effect variances can be quite hard to estimate precisely, and it seems likely that likelihood-optimization and MCMC sampling would generally start to diverge in their estimates of the random effects, which will often in turn influence the fixed effects to some degree. Certainly we expect the results of the two approaches to not be wildly different, but I suspect that in many cases they will be a bit different, and that this is probably a general feature of Bayesian vs. ML estimation in mixed models.
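To make the baseline expectation concrete: for a regression with no random effects and very diffuse priors, MCMC and OLS should land essentially on top of each other. A minimal sketch with made-up data (the priors, sample sizes, and variable names are illustrative):

```python
import numpy as np
import pymc3 as pm

# Made-up data for a plain regression with no random effects
rng = np.random.RandomState(42)
x = rng.randn(200)
y = 2.0 + 3.0 * x + rng.randn(200)
X = np.column_stack([np.ones_like(x), x])

# OLS estimate via least squares
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# Bayesian estimate of the same model with very diffuse priors
with pm.Model():
    beta = pm.Normal('beta', mu=0, sd=100, shape=2)
    sigma = pm.HalfCauchy('sigma', beta=5)
    pm.Normal('obs', mu=pm.math.dot(X, beta), sd=sigma, observed=y)
    trace = pm.sample(2000)

# The two estimates should agree to roughly two decimal places
print(beta_ols, trace['beta'].mean(axis=0))
```

Once random effects enter, there is no analogous guarantee, for the reasons above.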

junpenglao commented on May 30, 2024

@tyarkoni I am glad you find it helpful ;-) Another package called Edward follows a similar idea of supporting different backends. You might find it interesting.

@jake-westfall That's a very good insight, and I agree that Bayesian and OLS estimation are not necessarily identical. What puzzles me, though, is that even when I directly compare the coefficients estimated by Stan and PyMC3, the outputs are still a bit different. Of course, all the outputs display the same pattern.
I should also add that the models I used in Stan and PyMC3 are not exactly identical: I did not use the same priors, and in PyMC3 I used Metropolis instead of NUTS (NUTS was too slow).
I think with more simulations and more data this will become clearer. So please keep me posted if you have more information on this!
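A minimal sketch of that sampler comparison, using a toy varying-intercept model; the data, priors, and variable names are illustrative, not those from the notebook:

```python
import numpy as np
import pymc3 as pm

# Toy varying-intercept data (illustrative; not the notebook's dataset)
rng = np.random.RandomState(0)
n_groups, n_per = 8, 30
group = np.repeat(np.arange(n_groups), n_per)
x = rng.randn(n_groups * n_per)
y = 1.0 + 0.5 * x + rng.randn(n_groups)[group] + rng.randn(n_groups * n_per)

with pm.Model():
    a = pm.Normal('a', mu=0, sd=10)                # fixed intercept
    b = pm.Normal('b', mu=0, sd=10)                # fixed slope
    sd_group = pm.HalfCauchy('sd_group', beta=5)   # random-intercept sd
    u = pm.Normal('u', mu=0, sd=sd_group, shape=n_groups)
    sd_resid = pm.HalfCauchy('sd_resid', beta=5)
    pm.Normal('obs', mu=a + b * x + u[group], sd=sd_resid, observed=y)

    # Metropolis: cheap per step, so very long chains are feasible
    trace_met = pm.sample(100000, step=pm.Metropolis())
    # NUTS (the PyMC3 default): slower per step, usually mixes much better
    trace_nuts = pm.sample(2000)

# If both samplers have truly converged, these should match closely
print(trace_met['b'].mean(), trace_nuts['b'].mean())
```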

tyarkoni commented on May 30, 2024

NUTS is much slower but Metropolis often fails to converge, so it's possible that's the issue. If you're willing to wait it out, you could try it again with NUTS and see if you get closer correspondence.

tyarkoni commented on May 30, 2024

Closing this, since it isn't a Bambi issue per se, but thanks for bringing it up, @junpenglao, and please post an update if you re-run with the NUTS sampler!

junpenglao commented on May 30, 2024

Thank you for the suggestion @tyarkoni. I did rerun the model with NUTS. Actually, the problem is not NUTS being too slow; in my case, it is not sampling properly (flat trace, or convergence to a weird value).
And the Metropolis chain has most likely already converged: I purposely drew a very large number of samples (100,000), and the trace looks OK. Again, it is possible that this is not a common case.
One last note: although the model coefficients are not identical, they all fit the data very well. You can see the plot in the notebook. Stan gives the smallest ||y - y_hat||, but I don't think the advantage is significant.
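For reference, the ||y - y_hat|| comparison can be computed along these lines; the design matrix and point estimates below are hypothetical stand-ins, not the notebook's actual numbers:

```python
import numpy as np

# Stand-in data and design matrix
rng = np.random.RandomState(1)
X = np.column_stack([np.ones(100), rng.randn(100)])
y = X.dot([1.0, 0.5]) + rng.randn(100)

# Hypothetical point estimates standing in for each package's fit
# (posterior means for Stan/PyMC3, ML estimates for lme4)
estimates = {
    'Stan':  np.array([0.98, 0.51]),
    'PyMC3': np.array([1.01, 0.49]),
    'lme4':  np.array([0.99, 0.50]),
}

for name, coef in estimates.items():
    print(name, np.linalg.norm(y - X.dot(coef)))  # ||y - y_hat||
```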

junpenglao commented on May 30, 2024

Just a small update: I tried Bambi on my data, and the output is almost identical to lme4 and Stan. Do you have a paper or reference explaining your implementation? Looking at model.terms.values(), it seems some of the hyperparameters in the priors are set using an Empirical Bayes approach (e.g., the mu of the intercept, and the sd for the fixed effects in my model)?
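A sketch of that inspection, assuming the early bambi API in use at the time; the data, formula, and sampler settings are placeholders, the name/prior attributes on the term objects are an assumption, and the interface has changed in later bambi releases:

```python
import numpy as np
import pandas as pd
from bambi import Model

# Placeholder data standing in for the actual dataset
rng = np.random.RandomState(0)
data = pd.DataFrame({
    'x': rng.randn(120),
    'subject': np.repeat(np.arange(12), 10),
})
data['y'] = 1.0 + 0.5 * data['x'] + rng.randn(120)

model = Model(data)
results = model.fit('y ~ x', random=['1|subject'], samples=2000)

# Inspect the automatically chosen priors, as described above
# (assumes each term exposes .name and .prior)
for term in model.terms.values():
    print(term.name, term.prior)
```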

jake-westfall commented on May 30, 2024

I just added a notebook (still a WIP) that describes the default priors, at least for the fixed effects.

The theory/motivation behind the default priors is that they are set up "as if" the priors had been placed on the partial correlation coefficients between the outcome and the predictors. I say "as if" because the priors are not literally placed on the partial correlations -- directly they are placed on the unstandardized regression coefficients -- but the parameters of the priors are chosen so that the implied prior on the partial correlation scale is sensible.

Setting a prior on any standardized effect size requires looking at at least some aspect of the data, such as the standard deviation of Y. (Well, it doesn't technically require it, but it's much simpler this way.) So in that sense it is kinda-sorta Empirical Bayesian, although note that we never look directly at an estimate of the thing we're placing a prior on. Note that a similar approach (of having the priors technically depend on some aspect of the data) is taken with some of the default priors in, e.g., the brms and rstanarm packages in R.
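A back-of-the-envelope illustration of the idea for the simple one-predictor case, where beta = r * sd(y) / sd(x); this is not bambi's actual formula, which works with partial correlations and handles multiple terms:

```python
import numpy as np

# Variables on arbitrary raw scales
rng = np.random.RandomState(0)
x = rng.randn(200) * 3.0            # predictor with sd ~ 3
y = 2.0 * x + rng.randn(200) * 5.0  # outcome on its own scale

# Desired prior width on the (standardized) correlation scale
scale_r = 0.5

# Implied width of a Normal prior on the unstandardized slope,
# since beta = r * sd(y) / sd(x) in a one-predictor regression
scale_beta = scale_r * y.std() / x.std()
print(scale_beta)
```

Note how sd(y) enters the calculation: that is the sense in which the procedure "looks at the data" without ever looking at an estimate of the coefficient itself.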

junpenglao commented on May 30, 2024

I see, this is really helpful! Thank you very much!
