Thank you for implementing aldvmm in R! Is it possible to estimate c

Dear Isa Maria Thanks a lot for using the <a href="https://cran.r-pr

Dear Achim <a class="user-mention notranslate" data-hovercard-type="user" data-hoverca

Dear <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url

Mark <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url

Dear Achim <a class="user-mention notranslate" data-hovercard-type="user" data-hoverca

Thanks for the example, Mark <a class="user-mention notranslate" data-hovercard-type="

So far I have put together the following standard methods. The <code class="notranslat

Thanks for the update, that makes sense! <a class="user-mention notr

Comments (14)

pletschm commented on June 11, 2024 2

Dear Isa Maria

Thanks a lot for using the aldvmm package and for reaching out.

Unfortunately, the current implementation of the aldvmm package does not include an option to compute clustered standard errors. While STATA allows the calculation of various variance estimators in a standardized manner using the option vce, the calculation of robust and clustered standard errors in R is typically done in an additional step after model fitting.

I recommend using the sandwich package which can use the inverse Hessian and the gradient of the log-likelihood function at each observation to calculate clustered standard errors. In the case of the aldvmm package, these arguments can be approximated using numDeriv::grad and numDeriv::hessian. Please see the following answer to a question on Stackoverflow by Achim Zeileis (@zeileis), the author of the sandwich package.

I am sorry there is no standard option for clustered standard errors in the aldvmm package and wish you all the best of luck with your project.

Best

Mark

from aldvmm.

pletschm commented on June 11, 2024 2

Dear Achim @zeileis

Thanks a million for your help and the quick turnaround!

The reason for the wrong gradient matrix in the original example is that the default optimization method "Nelder-Mead" does not use numerical gradients during optimization and converged at a local optimum. Convergence issues are common with aldvmm models, unfortunately. Once we use a "BFGS" algorithm for model fitting, the gradient function looks plausible, and the sandwich standard errors are very similar to the conventional standard errors. I updated the example above, but for completeness, you can find a complete example in the code below.

I will consider updating the package to include data in the fit object and accommodate sandwich estimators in a standardized manner in the future.

Thanks a lot for your help and all the best.

library("aldvmm")
library("sandwich")

data("utility", package = "aldvmm")

# Fit model 
#----------

fit <- aldvmm(eq5d ~ age | 1,
              data = utility,
              psi = c(0.883, -0.594),
              ncmp = 2,
              optim.method = "BFGS")

# Augment fit object with $data for model.matrix() and estfun() methods
fit$data <- utility

# Re-define sandwich functions
#-----------------------------

coef.aldvmm <- function(object, ...) object$coef

vcov.aldvmm <- function(object, ...) {
  vc <- object$cov
  rownames(vc) <- colnames(vc) <- names(object$coef)
  return(vc)
}

nobs.aldvmm <- function(object, ...) object$n

model.matrix.aldvmm <- function(object, ...) {
  aldvmm.mm(data = object$data, formula = object$formula, ncmp = object$k, lcoef = object$label$lcoef)
}

bread.aldvmm <- function(x, ...) vcov(x) * nobs(x)

estfun.aldvmm <- function(x, ...) {
  X <- model.matrix(x)
  ef <- lapply(1:nobs(x), function(i) {
    numDeriv::jacobian(
      func = function(z) aldvmm::aldvmm.ll(
        par = z,
        X = lapply(X, function (m) t(as.matrix(m[i, ]))),
        y = x$pred$y[i],
        psi = x$psi,
        ncmp = x$k,
        dist = x$dist,
        lcoef = x$label$lcoef,
        lcpar = x$label$lcpar,
        lcmp = x$label$lcmp,
        optim.method = x$optim.method),
      x = x$coef
    )})
  ef <- do.call("rbind", ef)
  colnames(ef) <- names(x$coef)
  return(ef)
}

# Compare variance-covariance estimators
#---------------------------------------

cbind(hessian = sqrt(diag(vcov(fit))), 
      sandwich = sqrt(diag(sandwich(fit))),
      cluster = sqrt(diag(vcovCL(fit, cluster = utility$female))))

from aldvmm.

pletschm commented on June 11, 2024 2

Dear @IsaMariaSteiner and @zeileis

With the update to version 0.8.6, available on CRAN today, the package aldvmm includes all methods needed to supply aldvmm model fits to sandwich::sandwich() and sandwich::vcovCL().

Please see the appendix of the vignette for an example.

Further improvements to the workflow and the reporting were made based on @zeileis' feedback.

Thank you both very much for your feedback and your help.

Best

Mark

from aldvmm.

zeileis commented on June 11, 2024 1

Mark @pletschm, maybe you can show how to set up the log-likelihood function for a fitted model object? Possibly along with evaluating the numDeriv::grad and numDeriv::hessian? Based on that I'm happy to have a look at getting the sandwich machinery working.

from aldvmm.

pletschm commented on June 11, 2024 1

Dear Achim @zeileis

Thanks a lot for offering your support! Please find an example of the numerical estimation of the gradient matrix and the Hessian below. The derivation of the gradient matrix is a bit clumsy. I apologize, maybe there is a more efficient way.

I hope this example is useful for exploring clustered standard errors using female sex as a cluster.

Thanks a lot for your kind help.

library(aldvmm)

data(utility)

formula <- eq5d ~ age + female | 1
data <- utility
psi <- c(0.883, -0.594)
ncmp <- 2

# Fit model 
#----------

fit <- aldvmm(formula,
              data = data,
              psi = psi,
              ncmp = ncmp,
              optim.method = "BFGS")

# Create model matrix
#--------------------

mm <- aldvmm.mm(data = data,
                formula = fit$formula,
                ncmp = fit$k,
                lcoef = fit$label$lcoef)

# Create matrix of gradients per observation
#-------------------------------------------

grad <- do.call(rbind, 
                lapply(1:length(fit$pred$y), function(i) {
                  numDeriv::jacobian(func = function(z) {
                    
                    aldvmm::aldvmm.ll(par = z,
                                      X = lapply(mm, function (m) {
                                        t(as.matrix(m[i, ]))
                                      }),
                                      y = fit$pred$y[i], 
                                      psi = fit$psi,
                                      ncmp = fit$k,
                                      dist = fit$dist,
                                      lcoef = fit$label$lcoef,
                                      lcpar = fit$label$lcpar,
                                      lcmp = fit$label$lcmp,
                                      optim.method = fit$optim.method)
                    
                  },
                  x = fit$coef)
                })
)

colnames(grad) <- names(fit$coef)

# Create Hessian
#---------------

# De-novo approximation
H <- numDeriv::hessian(func = aldvmm.ll,
                       x = fit$coef,
                       X = mm,
                       y = fit$pred$y,
                       psi = fit$psi,
                       ncmp = fit$k,
                       dist = fit$dist,
                       lcoef = fit$label$lcoef,
                       lcmp = fit$label$lcmp,
                       lcpar = fit$label$lcpar,
                       optim.method = fit$optim.method)

# Inverse covariance matrix
#H_hat <- solve(fit$cov)

rownames(H) <- names(fit$coef)
colnames(H) <- names(fit$coef)

from aldvmm.

zeileis commented on June 11, 2024 1

Thanks for the example, Mark @pletschm ! I made some progress implementing some methods for sandwich (see below) but the gradient computations do not seem to be correct. I get:

colSums(grad)
##  Comp1_beta_(Intercept)          Comp1_beta_age       Comp1_beta_female 
##             -37.4588685            -622.0484978              20.6440828 
##  Comp2_beta_(Intercept)          Comp2_beta_age       Comp2_beta_female 
##             -69.1730400           -2174.8841344              25.7471103 
## Comp1_delta_(Intercept)           Comp1_lnsigma           Comp2_lnsigma 
##               0.2651438              16.0643070              -0.3977486

I would have expected that the gradient should be close to zero for all parameters. Currently this doesn't seem to be the case.

(In contrast, the numDeriv::hessian() closely replicates the $hessian from the fitted model object.)

from aldvmm.

zeileis commented on June 11, 2024 1

So far I have put together the following standard methods. The coef(), vcov(), and nobs() could easily be included in the package itself because they just use the information that is already available in the object anyway. The model.matrix() also needs the data and hence does not work out of the box. I have solved this by simply augmenting the model object with a $data element.

Extract estimated coefficients:

coef.aldvmm <- function(object, ...) object$coef

Corresponding variance-covariance matrix:

vcov.aldvmm <- function(object, ...) {
  vc <- object$cov
  rownames(vc) <- colnames(vc) <- names(object$coef)
  return(vc)
}

Number of observations:

nobs.aldvmm <- function(object, ...) object$n

List of model matrices (requires object augmented with $data):

model.matrix.aldvmm <- function(object, ...) {
  aldvmm.mm(data = object$data, formula = object$formula, ncmp = object$k, lcoef = object$label$lcoef)
}

For sandwich, extract bread matrix:

bread.aldvmm <- function(x, ...) vcov(x) * nobs(x)

For sandwich, approximate gradient (does not work yet, I think):

estfun.aldvmm <- function(x, ...) {
  X <- model.matrix(x)
  ef <- lapply(1:nobs(x), function(i) {
    numDeriv::jacobian(
      func = function(z) aldvmm::aldvmm.ll(
        par = z,
        X = lapply(X, function (m) t(as.matrix(m[i, ]))),
        y = x$pred$y[i], 
        psi = x$psi,
        ncmp = x$k,
        dist = x$dist,
        lcoef = x$label$lcoef,
        lcpar = x$label$lcpar,
        lcmp = x$label$lcmp,
        optim.method = x$optim.method),
      x = x$coef
    )})
  ef <- do.call("rbind", ef)
  colnames(ef) <- names(x$coef)
  return(ef)
}

from aldvmm.

zeileis commented on June 11, 2024 1

With the additional methods above you can then try to compare the standard variance-covariance matrix (based on the Hessian) with the (unclustered) sandwich matrix. Currently, these are rather different, possibly because the gradient/estfun does not match properly (see above).

## package and data
library("aldvmm")
data("utility", package = "aldvmm")

## fit model object
fit <- aldvmm(eq5d ~ age + female | 1, data = utility, psi = c(0.883, -0.594), ncmp = 2)

## augment with $data for model.matrix() and estfun() methods
fit$data <- utility

## compare variance-covariance estimators
library("sandwich")
cbind(hessian = sqrt(diag(vcov(fit))), sandwich = sqrt(diag(sandwich(fit))))
##                             hessian    sandwich
## Comp1_beta_(Intercept)  0.183953589 0.875496684
## Comp1_beta_age          0.001410632 0.001748473
## Comp1_beta_female       0.179863260 0.917377868
## Comp2_beta_(Intercept)  0.073323218 0.112740885
## Comp2_beta_age          0.001442085 0.002720220
## Comp2_beta_female       0.051264742 0.143497273
## Comp1_delta_(Intercept) 0.632233599 3.174625142
## Comp1_lnsigma           0.349598557 1.799327200
## Comp2_lnsigma           0.218654246 0.898138187

from aldvmm.

zeileis commented on June 11, 2024 1

Thanks for the update, that makes sense!

@IsaMariaSteiner if you want to use a clustered covariance matrix you can do so along the following lines:

utility$id <- rep(1:20, each = 10)
vcovCL(fit, utility$id)

@pletschm if you want to improve the package, I would recommend:

Use optim.method = "BFGS" as the default. In my experience this often works better than Nelder-Mead (especially if analytical gradients are available). If BFGS by itself is too unreliable, then you might first run Nelder-Mead and subsequently BFGS for the "fine-tuning".
Add some more standard methods for aldvmm objects, especially a print() method and the coef(), vcov(), and nobs() from above.
For better supporting data extraction, adopt a standard workflow with terms and model.frame and then provide the corresponding methods. For properly handling multi-part formulas, check out the Formula package and the packages that leverage it (e.g., ivreg for one of my relatively recent examples).

from aldvmm.

zeileis commented on June 11, 2024 1

Yes, exactly. In the example above, the following works (based on the artificially added id variable):

coeftest(fit, vcov = vcovCL, cluster = utility$id)

As the computation of the variance-covariance matrix is relatively slow you may want to pre-compute and store it. You can do this via:

vc <- vcovCL(fit, cluster = utility$id)
coeftest(fit, vcov = vc)

This yields exactly the same output but splits up the steps into one long and one quick computation.

from aldvmm.

IsaMariaSteiner commented on June 11, 2024

Dear Mark,

thank you very much for replying to my question and also thank you @zeileis for approving the answer.

While the approach seems to be intuitive for you, I am having difficulties implementing it (sorry, I am not a statistician).

Would you mind giving a reproducible example, for example for this case:

library(aldvmm)
library(sandwich)
library(lmtest)
library(numDeriv)

#############################################

Data

############################################
data(utility)
utility$subject_id <- 1:200

t2 <- utility
t2$eq5d <- utility$eq5d + rnorm(200, sd = 0.2)

t2$eq5d[t2$eq5d < -0.594] <- -0.594
t2$eq5d[t2$eq5d > 0.883 & t2$eq5d < 1] <- 0.883
t2$eq5d[t2$eq5d > 1] <- 1

utility_long <- rbind(utility, t2)

#############################################

Model

############################################

fit <- aldvmm(eq5d ~ age + female | 1,
data = utility_long,
psi = c(0.883, -0.594),
ncmp = 2)

Best regards,
Isa

from aldvmm.

pletschm commented on June 11, 2024

Thanks a lot Achim @zeileis for your help and the suggestions.

I am very grateful for such recommendations and will implement them in the next update. I am also planning to update the vignette with an example of clustered or robust standard errors. I wasn't aware of the Formula package. Thanks for that.

Since @IsaMariaSteiner reacted with a thumbs up, I am closing this issue.

from aldvmm.

IsaMariaSteiner commented on June 11, 2024

Thanks a lot @pletschm and @zeileis for taking the effort!
I assume that cluster robust standard errors are then as usual obtained with the coeftest command from the lmtest package.

from aldvmm.

IsaMariaSteiner commented on June 11, 2024

Thank you, this helps a lot!

from aldvmm.

Cluster-robust standard errors for variance-covariance matrix about aldvmm HOT 14 CLOSED

Comments (14)

Data

Model

Related Issues (2)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent