Comments (7)

ASKurz commented on May 29, 2024

Hey Jonathan,

Thanks for your interest and for sharing your work! A couple other folks have been working with me on this issue, too. I don’t have time to look at your results in detail, at the moment. But I look forward to comparing your method with that from my other collaborators soon.

from statistical_rethinking_with_brms_ggplot2_and_the_tidyverse.

paul-buerkner commented on May 29, 2024

Try out something like

brm(total_tools ~ gp(x) + logpop, ...)

where x is the position variable from which Dmat was constructed. Not sure if this is really equivalent, but it might be a start.

ASKurz commented on May 29, 2024

Thanks for the suggestion! I'll play with it in a bit and see how it goes.

ASKurz commented on May 29, 2024

In these data, lat = latitude and lon2 = longitude.

Here's my current attempt:

b13.7 <- 
  brm(data = d, family = poisson,
      total_tools ~ 1 + gp(lat, lon2) + logpop,
      prior = c(set_prior("normal(0, 10)", class = "Intercept"),
                set_prior("normal(0, 1)", class = "b"),
                set_prior("cauchy(0, 1)", class = "lscale"),
                set_prior("cauchy(0, 1)", class = "sdgp")),
      iter = 1e4, warmup = 2000, chains = 4, cores = 4,
      control = list(adapt_delta = 0.999,
                     max_treedepth = 12))

Using posterior_summary(b13.7) %>% round(digits = 2), the results were:

                  Estimate Est.Error 2.5%ile 97.5%ile
b_Intercept           1.42      1.13   -0.84     3.74
b_logpop              0.24      0.11    0.02     0.45
sdgp_gplatlon2        0.52      0.36    0.16     1.42
lscale_gplatlon2      0.23      0.13    0.07     0.56
zgp_gplatlon2[1]     -0.59      0.79   -2.18     0.96
zgp_gplatlon2[2]      0.44      0.84   -1.25     2.08
zgp_gplatlon2[3]     -0.61      0.71   -1.98     0.91
zgp_gplatlon2[4]      0.88      0.70   -0.46     2.29
zgp_gplatlon2[5]      0.24      0.75   -1.25     1.73
zgp_gplatlon2[6]     -1.01      0.79   -2.55     0.59
zgp_gplatlon2[7]      0.13      0.71   -1.36     1.52
zgp_gplatlon2[8]     -0.18      0.87   -1.89     1.55
zgp_gplatlon2[9]      0.41      0.91   -1.49     2.14
zgp_gplatlon2[10]    -0.32      0.83   -1.94     1.33
lp__                -51.52      3.17  -58.77   -46.42

For a point of reference, here's the rethinking model output using precis(m13.7, prob = .95, depth = 2) :

       Mean StdDev lower 0.95 upper 0.95 n_eff Rhat
g[1]  -0.25   0.45      -1.22       0.63  3352    1
g[2]  -0.10   0.44      -1.01       0.79  3114    1
g[3]  -0.15   0.43      -1.08       0.65  3102    1
g[4]   0.31   0.38      -0.48       1.09  3127    1
g[5]   0.04   0.38      -0.72       0.82  3064    1
g[6]  -0.44   0.39      -1.28       0.25  3270    1
g[7]   0.12   0.37      -0.67       0.85  3056    1
g[8]  -0.25   0.38      -1.03       0.49  3121    1
g[9]   0.25   0.36      -0.47       0.98  2794    1
g[10] -0.11   0.47      -1.02       0.85  4073    1
a      1.27   1.17      -1.15       3.62  4790    1
bp     0.25   0.12       0.01       0.48  6140    1
etasq  0.36   0.70       0.00       1.13  6334    1
rhosq  1.87  25.20       0.00       4.36 14061    1

My questions:

Here's the equation from the brms manual:
$k(x_{i}, x_{j}) = \text{sdgp}^2 \exp(-||x_{i} - x_{j}||^2 / (2\,\text{lscale}^2))$

Here's the corresponding one from the Statistical Rethinking text (page 412):
$\mathbf{K}_{ij} = \eta^2 \exp(-\rho^2 D^2_{ij}) + \delta_{ij}\sigma^2$

In the text, McElreath notes that $\delta_{ij}\sigma^2$ drops out when we only have one observation per location, so we don't need to consider it further in this example.

What you call $sdgp$, McElreath called $\eta$. While brms appears to estimate $sdgp$ (i.e., sdgp_gplatlon2 in my output), McElreath estimated $\eta^2$ in the text, which he called etasq in the code.

It appears that what you expressed as $||x_{i} - x_{j}||$ (i.e., the Euclidean norm), McElreath simply called $D_{ij}$, though the parameterization he used in the text includes its square, $D^2_{ij}$.

Question: Am I following correctly that $||x_{i} - x_{j}||$ (i.e., the Euclidean norm) is equivalent to McElreath's $D^2_{ij}$?

Also, you multiply $||x_{i} - x_{j}||^2$ by $-1/(2\,\text{lscale}^2)$, in which $lscale$ is what you call the "characteristic length-scale parameter." At this point in his equation, McElreath simply multiplies $D^2_{ij}$ by $-\rho^2$, where $\rho$ is what he described as the "rate of decline."

In my brms output, I get lscale_gplatlon2, which gets squared in the equation. In contrast, McElreath appears to estimate $\rho^2$ directly, which he called rhosq in his code.

Question: Am I following correctly that your $lscale$ should correspond to $\sqrt{1/(2\rho^2)}$?
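For what it's worth, this conjectured mapping is easy to sanity-check numerically. Below is a quick sketch in Python (rather than R, purely as a self-contained check) with made-up values for $\eta$, $\rho$, and a distance $d$; it only confirms the algebra that setting $sdgp = \eta$ and $lscale = \sqrt{1/(2\rho^2)}$ makes the two kernels agree.

```python
import math

# Made-up values for McElreath's parameters and a single distance (illustrative only)
eta, rho, d = 1.3, 0.8, 2.5

# Conjectured mapping to the brms parameterization
sdgp = eta
lscale = math.sqrt(1 / (2 * rho**2))

# SR text kernel: eta^2 * exp(-rho^2 * D^2)
k_rethinking = eta**2 * math.exp(-rho**2 * d**2)

# brms-style kernel: sdgp^2 * exp(-d^2 / (2 * lscale^2))
k_brms = sdgp**2 * math.exp(-d**2 / (2 * lscale**2))

print(abs(k_rethinking - k_brms) < 1e-12)
```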

paul-buerkner commented on May 29, 2024

Answer 1: Indeed, $D_{ij}$ is the Euclidean distance.

Answer 2: Looks correct to me. Please note that brms uses some special priors on lscale by default, as explained in https://betanalpha.github.io/assets/case_studies/gp_part3/part3.html

ASKurz commented on May 29, 2024

That's helpful. Thanks Paul!

jonnations commented on May 29, 2024

Hi Solomon,
I've been looking at the gp() functions in the context of the phylogenetic models in Statistical Rethinking v2. Your examples in the brms bookdown book are some of the only ones out there, and I've been looking closely at models b13.7 and m13.7.

My interest in statistics is strong but my skills are weak, and my algebra skills are even worse. I really hope I'm not totally misleading you here. I did notice, though, that the parameter lscale_gplatlon2 was squared somewhere in the model. It started as class = lscale and ended as lscale_gplatlon2. Looking back at your code above, and in the text, I was wondering if squaring it a second time might have caused some of the discrepancies?

First, I ran b13.7 and m13.7 exactly as you did in the book. I didn't use the non-centered version of the m13.7 model.

In your book (not sure the best way to reference locations in a bookdown!), you estimated the brms version of rho_squared as

  transmute(rho_squared = 1 / (2 * lscale_gplatlon2^2)) %>% 
  mean_hdi(rho_squared, .width = .89) %>% 
  mutate_if(is.double, round, digits = 3)

This gave a number of 21.4. However, as lscale was already squared somewhere in the model, when I do the same with

  #below is the difference
  transmute(rho_squared = 1 / (2 * lscale_gplatlon2)) %>% 
  mean_hdi(rho_squared, .width = .89) %>% 
  mutate_if(is.double, round, digits = 3)

I get a result (2.88) that resembles McElreath's result of 2.29.
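As an aside, the gap between the two candidate transforms is easy to see with toy numbers. Here's a sketch in Python with hypothetical lscale draws (made up for illustration, not the actual posterior); it just shows how much the extra square inflates the implied $\rho^2$ when lscale is small.

```python
# Hypothetical lscale draws (illustrative only, not the real posterior)
lscale_draws = [0.10, 0.15, 0.23, 0.35, 0.50]

# Transform used in the book, treating lscale as un-squared
rhosq_a = [1 / (2 * s**2) for s in lscale_draws]

# Alternative transform, treating lscale as already squared
rhosq_b = [1 / (2 * s) for s in lscale_draws]

# The two transforms diverge sharply for small lscale values
for s, a, b in zip(lscale_draws, rhosq_a, rhosq_b):
    print(s, round(a, 2), round(b, 2))
```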

The same goes for the "holy smokes!" plot below. When I changed it to

post_m13.7[, "rhosq"] %>%
  bind_rows(post_b13.7 %>% 
              # again, not squaring lscale_gplatlon2
              transmute(rhosq = 1 / (2 * lscale_gplatlon2))) %>% 
  mutate(package = rep(c("rethinking", "brms"), each = nrow(post_m13.7))) %>% 
  
  ggplot(aes(x = rhosq, fill = package)) +
  geom_density() +
  scale_fill_manual(NULL, values = c("#80A0C7", "#A65141")) +
  labs(title = "Are these the same? Maybe? I'm not sure...",
       x = expression(rho^2)) +
  xlim(c(0, 15)) +
  # theme_pearl_earring +
  theme(legend.position = "none")

They look similar, as if simple discrepancies in the priors and likelihoods may have generated the differences.
[image: overlaid density plots of rho^2 from the rethinking and brms models]

It's maybe worth noting that the tail of the rethinking rho_sq goes WAY out into the 1000s, which is why the mean is 2.8 even though the mode is around 0.1.

I hope this can help. Thanks for all the hard work you have put into your book and your blog!
