Comments (14)
Dear Isa Maria
Thanks a lot for using the aldvmm package and for reaching out.
Unfortunately, the current implementation of the aldvmm package does not include an option to compute clustered standard errors. While STATA allows the calculation of various variance estimators in a standardized manner using the option vce, the calculation of robust and clustered standard errors in R is typically done in an additional step after model fitting.
I recommend using the sandwich package which can use the inverse Hessian and the gradient of the log-likelihood function at each observation to calculate clustered standard errors. In the case of the aldvmm package, these arguments can be approximated using numDeriv::grad and numDeriv::hessian. Please see the following answer to a question on Stackoverflow by Achim Zeileis (@zeileis), the author of the sandwich package.
I am sorry there is no standard option for clustered standard errors in the aldvmm package and wish you all the best of luck with your project.
Best
Mark
from aldvmm.
Dear Achim @zeileis
Thanks a million for your help and the quick turnaround!
The reason for the wrong gradient matrix in the original example is that the default optimization method "Nelder-Mead" does not use numerical gradients during optimization and converged at a local optimum. Convergence issues are common with aldvmm models, unfortunately. Once we use a "BFGS" algorithm for model fitting, the gradient function looks plausible, and the sandwich standard errors are very similar to the conventional standard errors. I updated the example above, but for completeness, you can find a complete example in the code below.
I will consider updating the package to include data in the fit object and accommodate sandwich estimators in a standardized manner in the future.
Thanks a lot for your help and all the best.
library("aldvmm")
library("sandwich")
data("utility", package = "aldvmm")
# Fit model
#----------
fit <- aldvmm(eq5d ~ age | 1,
data = utility,
psi = c(0.883, -0.594),
ncmp = 2,
optim.method = "BFGS")
# Augment fit object with $data for model.matrix() and estfun() methods
fit$data <- utility
# Re-define sandwich functions
#-----------------------------
coef.aldvmm <- function(object, ...) object$coef
vcov.aldvmm <- function(object, ...) {
vc <- object$cov
rownames(vc) <- colnames(vc) <- names(object$coef)
return(vc)
}
nobs.aldvmm <- function(object, ...) object$n
model.matrix.aldvmm <- function(object, ...) {
aldvmm.mm(data = object$data, formula = object$formula, ncmp = object$k, lcoef = object$label$lcoef)
}
bread.aldvmm <- function(x, ...) vcov(x) * nobs(x)
estfun.aldvmm <- function(x, ...) {
X <- model.matrix(x)
ef <- lapply(1:nobs(x), function(i) {
numDeriv::jacobian(
func = function(z) aldvmm::aldvmm.ll(
par = z,
X = lapply(X, function (m) t(as.matrix(m[i, ]))),
y = x$pred$y[i],
psi = x$psi,
ncmp = x$k,
dist = x$dist,
lcoef = x$label$lcoef,
lcpar = x$label$lcpar,
lcmp = x$label$lcmp,
optim.method = x$optim.method),
x = x$coef
)})
ef <- do.call("rbind", ef)
colnames(ef) <- names(x$coef)
return(ef)
}
# Compare variance-covariance estimators
#---------------------------------------
cbind(hessian = sqrt(diag(vcov(fit))),
sandwich = sqrt(diag(sandwich(fit))),
cluster = sqrt(diag(vcovCL(fit, cluster = utility$female))))
from aldvmm.
Dear @IsaMariaSteiner and @zeileis
With the update to version 0.8.6, available on CRAN today, the package aldvmm includes all methods needed to supply aldvmm model fits to sandwich::sandwich() and sandwich::vcovCL().
Please see the appendix of the vignette for an example.
Further improvements to the workflow and the reporting were made based on @zeileis' feedback.
Thank you both very much for your feedback and your help.
Best
Mark
from aldvmm.
Mark @pletschm, maybe you can show how to set up the log-likelihood function for a fitted model object? Possibly along with evaluating the numDeriv::grad
and numDeriv::hessian
? Based on that I'm happy to have a look at getting the sandwich
machinery working.
from aldvmm.
Dear Achim @zeileis
Thanks a lot for offering your support! Please find an example of the numerical estimation of the gradient matrix and the Hessian below. The derivation of the gradient matrix is a bit clumsy. I apologize, maybe there is a more efficient way.
I hope this example is useful for exploring clustered standard errors using female sex as a cluster.
Thanks a lot for your kind help.
library(aldvmm)
data(utility)
formula <- eq5d ~ age + female | 1
data <- utility
psi <- c(0.883, -0.594)
ncmp <- 2
# Fit model
#----------
fit <- aldvmm(formula,
data = data,
psi = psi,
ncmp = ncmp,
optim.method = "BFGS")
# Create model matrix
#--------------------
mm <- aldvmm.mm(data = data,
formula = fit$formula,
ncmp = fit$k,
lcoef = fit$label$lcoef)
# Create matrix of gradients per observation
#-------------------------------------------
grad <- do.call(rbind,
lapply(1:length(fit$pred$y), function(i) {
numDeriv::jacobian(func = function(z) {
aldvmm::aldvmm.ll(par = z,
X = lapply(mm, function (m) {
t(as.matrix(m[i, ]))
}),
y = fit$pred$y[i],
psi = fit$psi,
ncmp = fit$k,
dist = fit$dist,
lcoef = fit$label$lcoef,
lcpar = fit$label$lcpar,
lcmp = fit$label$lcmp,
optim.method = fit$optim.method)
},
x = fit$coef)
})
)
colnames(grad) <- names(fit$coef)
# Create Hessian
#---------------
# De-novo approximation
H <- numDeriv::hessian(func = aldvmm.ll,
x = fit$coef,
X = mm,
y = fit$pred$y,
psi = fit$psi,
ncmp = fit$k,
dist = fit$dist,
lcoef = fit$label$lcoef,
lcmp = fit$label$lcmp,
lcpar = fit$label$lcpar,
optim.method = fit$optim.method)
# Inverse covariance matrix
#H_hat <- solve(fit$cov)
rownames(H) <- names(fit$coef)
colnames(H) <- names(fit$coef)
from aldvmm.
Thanks for the example, Mark @pletschm ! I made some progress implementing some methods for sandwich
(see below) but the gradient computations do not seem to be correct. I get:
colSums(grad)
## Comp1_beta_(Intercept) Comp1_beta_age Comp1_beta_female
## -37.4588685 -622.0484978 20.6440828
## Comp2_beta_(Intercept) Comp2_beta_age Comp2_beta_female
## -69.1730400 -2174.8841344 25.7471103
## Comp1_delta_(Intercept) Comp1_lnsigma Comp2_lnsigma
## 0.2651438 16.0643070 -0.3977486
I would have expected that the gradient should be close to zero for all parameters. Currently this doesn't seem to be the case.
(In contrast, the numDeriv::hessian()
closely replicates the $hessian
from the fitted model object.)
from aldvmm.
So far I have put together the following standard methods. The coef()
, vcov()
, and nobs()
could easily be included in the package itself because they just use the information that is already available in the object anyway. The model.matrix()
also needs the data and hence does not work out of the box. I have solved this by simply augmenting the model object with a $data
element.
Extract estimated coefficients:
coef.aldvmm <- function(object, ...) object$coef
Corresponding variance-covariance matrix:
vcov.aldvmm <- function(object, ...) {
vc <- object$cov
rownames(vc) <- colnames(vc) <- names(object$coef)
return(vc)
}
Number of observations:
nobs.aldvmm <- function(object, ...) object$n
List of model matrices (requires object augmented with $data
):
model.matrix.aldvmm <- function(object, ...) {
aldvmm.mm(data = object$data, formula = object$formula, ncmp = object$k, lcoef = object$label$lcoef)
}
For sandwich
, extract bread matrix:
bread.aldvmm <- function(x, ...) vcov(x) * nobs(x)
For sandwich
, approximate gradient (does not work yet, I think):
estfun.aldvmm <- function(x, ...) {
X <- model.matrix(x)
ef <- lapply(1:nobs(x), function(i) {
numDeriv::jacobian(
func = function(z) aldvmm::aldvmm.ll(
par = z,
X = lapply(X, function (m) t(as.matrix(m[i, ]))),
y = x$pred$y[i],
psi = x$psi,
ncmp = x$k,
dist = x$dist,
lcoef = x$label$lcoef,
lcpar = x$label$lcpar,
lcmp = x$label$lcmp,
optim.method = x$optim.method),
x = x$coef
)})
ef <- do.call("rbind", ef)
colnames(ef) <- names(x$coef)
return(ef)
}
from aldvmm.
With the additional methods above you can then try to compare the standard variance-covariance matrix (based on the Hessian) with the (unclustered) sandwich matrix. Currently, these are rather different, possibly because the gradient/estfun
does not match properly (see above).
## package and data
library("aldvmm")
data("utility", package = "aldvmm")
## fit model object
fit <- aldvmm(eq5d ~ age + female | 1, data = utility, psi = c(0.883, -0.594), ncmp = 2)
## augment with $data for model.matrix() and estfun() methods
fit$data <- utility
## compare variance-covariance estimators
library("sandwich")
cbind(hessian = sqrt(diag(vcov(fit))), sandwich = sqrt(diag(sandwich(fit))))
## hessian sandwich
## Comp1_beta_(Intercept) 0.183953589 0.875496684
## Comp1_beta_age 0.001410632 0.001748473
## Comp1_beta_female 0.179863260 0.917377868
## Comp2_beta_(Intercept) 0.073323218 0.112740885
## Comp2_beta_age 0.001442085 0.002720220
## Comp2_beta_female 0.051264742 0.143497273
## Comp1_delta_(Intercept) 0.632233599 3.174625142
## Comp1_lnsigma 0.349598557 1.799327200
## Comp2_lnsigma 0.218654246 0.898138187
from aldvmm.
Thanks for the update, that makes sense!
@IsaMariaSteiner if you want to use a clustered covariance matrix you can do so along the following lines:
utility$id <- rep(1:20, each = 10)
vcovCL(fit, utility$id)
@pletschm if you want to improve the package, I would recommend:
- Use
optim.method = "BFGS"
as the default. In my experience this often works better than Nelder-Mead (especially if analytical gradients are available). If BFGS by itself is too unreliable, then you might first run Nelder-Mead and subsequently BFGS for the "fine-tuning". - Add some more standard methods for
aldvmm
objects, especially aprint()
method and thecoef()
,vcov()
, andnobs()
from above. - For better supporting data extraction, adopt a standard workflow with
terms
andmodel.frame
and then provide the corresponding methods. For properly handling multi-part formulas, check out theFormula
package and the packages that leverage it (e.g.,ivreg
for one of my relatively recent examples).
from aldvmm.
Yes, exactly. In the example above, the following works (based on the artificially added id
variable):
coeftest(fit, vcov = vcovCL, cluster = utility$id)
As the computation of the variance-covariance matrix is relatively slow you may want to pre-compute and store it. You can do this via:
vc <- vcovCL(fit, cluster = utility$id)
coeftest(fit, vcov = vc)
This yields exactly the same output but splits up the steps into one long and one quick computation.
from aldvmm.
Dear Mark,
thank you very much for replying to my question and also thank you @zeileis for approving the answer.
While the approach seems to be intuitive for you, I am having difficulties implementing it (sorry, I am not a statistician).
Would you mind giving a reproducible example, for example for this case:
library(aldvmm)
library(sandwich)
library(lmtest)
library(numDeriv)
#############################################
Data
############################################
data(utility)
utility$subject_id <- 1:200
t2 <- utility
t2$eq5d <- utility$eq5d + rnorm(200, sd = 0.2)
t2$eq5d[t2$eq5d < -0.594] <- -0.594
t2$eq5d[t2$eq5d > 0.883 & t2$eq5d < 1] <- 0.883
t2$eq5d[t2$eq5d > 1] <- 1
utility_long <- rbind(utility, t2)
#############################################
Model
############################################
fit <- aldvmm(eq5d ~ age + female | 1,
data = utility_long,
psi = c(0.883, -0.594),
ncmp = 2)
Best regards,
Isa
from aldvmm.
Thanks a lot Achim @zeileis for your help and the suggestions.
I am very grateful for such recommendations and will implement them in the next update. I am also planning to update the vignette with an example of clustered or robust standard errors. I wasn't aware of the Formula
package. Thanks for that.
Since @IsaMariaSteiner reacted with a thumbs up, I am closing this issue.
from aldvmm.
Thanks a lot @pletschm and @zeileis for taking the effort!
I assume that cluster robust standard errors are then as usual obtained with the coeftest
command from the lmtest
package.
from aldvmm.
Thank you, this helps a lot!
from aldvmm.
Related Issues (2)
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from aldvmm.