
medshift's Introduction

hello, i'm Nima

I'm an academic (bio)statistician working at the interface of causal inference, machine learning, and non- and semi-parametric statistics. I'm passionate about building open-source software tools to improve the accessibility of modern, model-agnostic and assumption-lean methods for statistical inference and causal machine learning, and I'm especially excited by the applications of statistics to the biomedical and public health sciences.

Are you looking for open source software for targeted causal machine learning? Maybe you should check out the tlverse project and browse our free open-source handbook!


medshift's People

Contributors

jeremyrcoyle, nhejazi


medshift's Issues

test indexing approaches

A test should be written to ensure that indexing is done correctly for Dy; that is, A * component and (1 - A) * component should be sufficient for indexing.
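A minimal sketch (using testthat) of the sort of check intended here: for binary A, multiplying by A and (1 - A) should select the same elements as explicit subsetting. All object names below are illustrative rather than the package's internals.

library(testthat)

test_that("multiplicative indexing matches explicit subsetting for Dy", {
  set.seed(42)
  n <- 100
  A <- rbinom(n, 1, 0.5)
  component_1 <- rnorm(n)  # hypothetical EIF piece for the A = 1 arm
  component_0 <- rnorm(n)  # hypothetical EIF piece for the A = 0 arm

  # indexing via multiplication by the treatment indicator
  dy_mult <- A * component_1 + (1 - A) * component_0

  # indexing via explicit subsetting
  dy_sub <- numeric(n)
  dy_sub[A == 1] <- component_1[A == 1]
  dy_sub[A == 0] <- component_0[A == 0]

  expect_equal(dy_mult, dy_sub)
})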

Functionality for true continuous treatment

Hey Nima,

Is there work underway to make this package work for a truly continuous treatment? I've tested it in simple continuous A settings and get errors related to eif_component_names for the onestep estimator and an estimator_args[["max_iter"]] error for the tmle estimator.

propensity score truncation

In the case of a binary intervention, both of the nuisance parameters G = P(A | W) and E = P(A | Z, W) are propensity scores, which may be susceptible to instability in the case of (near) practical violations of the assumption of positivity. It would be best to implement a flexible approach to automatically truncate estimated propensity scores, perhaps by default to the range (0.01, 0.99).
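A minimal sketch of such a truncation helper, assuming the default range of (0.01, 0.99) suggested above; the function name and bounds argument are illustrative, not part of the package's API.

truncate_propensity <- function(pscore, bounds = c(0.01, 0.99)) {
  # clip estimated propensity scores to the specified bounds
  pmin(pmax(pscore, bounds[1]), bounds[2])
}

# example usage on hypothetical estimated propensity scores
g_hat <- c(0.001, 0.2, 0.7, 0.999)
truncate_propensity(g_hat)
#> [1] 0.01 0.20 0.70 0.99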

Scaling transformation of outcome variable

It's generally the case that a continuous-valued outcome variable is re-scaled to fall in the interval [0, 1] via the transformation Y_scaled = (Y - min(Y)) / (max(Y) - min(Y)) for the purposes of estimation. Upon completion of the estimation procedure, the results should then be back-transformed to the original scale. A pair of functions for performing this transformation should be implemented. Note that, once this change is made, it will be necessary to edit the sl3_Task objects for each nuisance parameter regression task to manually specify family = "quasibinomial" in order to indicate that the Y values are not truly binary (i.e., taking values in {0, 1}) but rather simply fall in the unit interval.
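A minimal sketch of the pair of transformation helpers described above; the function names are illustrative.

scale_to_unit <- function(y) {
  # map a continuous outcome onto [0, 1]
  (y - min(y)) / (max(y) - min(y))
}

scale_from_unit <- function(y_scaled, y_min, y_max) {
  # back-transform estimates to the original outcome scale
  y_scaled * (y_max - y_min) + y_min
}

# example round trip
y <- rnorm(50, mean = 10, sd = 3)
y_star <- scale_to_unit(y)
all.equal(scale_from_unit(y_star, min(y), max(y)), y)
#> [1] TRUE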

Check that CV-folds is greater than 1

There should be a check, implemented with assertthat::assert_that(), to make sure that the number of folds specified for cross-validation is greater than 1. In the current implementation, origami::cross_validate will fail for V = 1 with an ambiguous/confusing error message, due to the way in which make_folds generates the structure of class folds. There's really no good reason that the case V = 1 needs to be supported, since using cross-validation / cross-fitting in constructing the AIPW estimator has theoretical benefits anyway.
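A minimal sketch of the intended check, assuming the number of folds is available as an argument named cv_folds inside the estimation routine:

assertthat::assert_that(
  cv_folds > 1,
  msg = "Cross-validation requires at least two folds (cv_folds > 1)."
)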

allow ensemble learning for phi

Allow an option for fitting the nuisance parameter Phi via arbitrary algorithms, as is already implemented for the other nuisance parameters.
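A hypothetical sketch of what this would allow: passing an sl3 super learner (rather than a single algorithm) as phi_learners, mirroring the other nuisance arguments. The particular library of learners below is only an example.

phi_sl <- sl3::Lrnr_sl$new(
  learners = list(
    sl3::Lrnr_glm$new(),
    sl3::Lrnr_xgboost$new(),
    sl3::Lrnr_mean$new()
  ),
  metalearner = sl3::Lrnr_nnls$new()
)
# this object could then be passed as phi_learners = phi_sl in the call to medshift()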

Multiple Mediators

Hi Nima,

First of all, thank you for your amazing work!
I am trying to decompose the effect of multiple mediators, ideally using xgboost. In the documentation, I see that medshift should be able to work with multiple mediators, but I keep getting an estimate for only one mediator, even when using the default example.

Best wishes,
Ahmed


library(medshift)
library(data.table)

# simulate a simple mediation data structure with a single mediator Z
make_simple_mediation_data <- function(n_obs = 1000) {
  W <- rbinom(n_obs, 1, prob = 0.50)
  A <- as.numeric(rbinom(n_obs, 1, prob = W / 4 + 0.1))
  z1_prob <- 1 - plogis((A^2 + W) / (A + W^3 + 0.5))
  Z <- rbinom(n_obs, 1, prob = z1_prob)
  Y <- Z + A - 0.1 * W + rnorm(n_obs, mean = 0, sd = 0.25)

  data <- as.data.table(cbind(Y, Z, A, W))
  setnames(data, c("Y", "Z", "A", "W"))
  return(data)
}

set.seed(75681)
example_data <- make_simple_mediation_data()

# add a second, independently generated mediator
example_data$ZZ <- sample(c(0, 1), nrow(example_data), replace = TRUE)

# one-step estimation with both mediators passed via Z
os_medshift <- medshift(
  W = example_data$W, A = example_data$A,
  Z = cbind(example_data$Z, example_data$ZZ), Y = example_data$Y,
  delta = 3,
  g_learners = sl3::Lrnr_xgboost$new(),
  e_learners = sl3::Lrnr_xgboost$new(),
  m_learners = sl3::Lrnr_xgboost$new(),
  phi_learners = sl3::Lrnr_xgboost$new(),
  estimator = "onestep",
  estimator_args = list(cv_folds = 3)
)
summary(os_medshift)


utility function for IP weights

The computation of IP weights is used in both the re-weighted estimator and the efficient estimator; it should be factored out into a utility function rather than computed manually in each place.
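A minimal sketch of such a utility, assuming estimated nuisance values g_hat = P(A | W) and e_hat = P(A | Z, W) are already available; the function name is illustrative.

ip_weights <- function(g_hat, e_hat) {
  # inverse probability weights shared by the re-weighted and efficient estimators
  g_hat / e_hat
}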

Arbitrary fold structures for one-step estimator

Currently, this line in the function est_onestep, which implements the one-step estimator, restricts how origami::make_folds is used. This should be generalized so that the number of folds and the specific fold function can be set arbitrarily by the user. A sensible default might be folds <- origami::make_folds(data, fold_fun = folds_vfold, V = 10).
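A hypothetical sketch of the generalization; make_onestep_folds is an illustrative name, with the fold function and its arguments simply passed through to origami.

make_onestep_folds <- function(data, fold_fun = origami::folds_vfold, ...) {
  # let the user control both the fold function and its arguments (e.g., V)
  origami::make_folds(data, fold_fun = fold_fun, ...)
}

# default behavior matching the suggestion above (data is any data.frame)
folds <- make_onestep_folds(data.frame(id = 1:100), V = 10)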

clearer documentation and naming

The documentation and organization of some parts of the package leave something to be desired. In particular, several new functions require documentation. Also, the naming of the estimators should be slightly revised for clarity, e.g., "reweighted" -> "ipw" and "efficient" -> "onestep".

weight stabilization

Note that the expectation of the weights g / e is equal to one. A good way to stabilize the AIPW estimator is to divide the weights by their empirical sample average. We should implement this.
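A minimal sketch of the stabilization described above: since the weights g / e have mean one in expectation, dividing by their empirical average re-centers them at one. The function name is illustrative.

stabilize_weights <- function(ip_weights) {
  # divide the IP weights by their empirical sample average
  ip_weights / mean(ip_weights)
}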

TODO

  • TMLE for the binary intervention case: this should be easy, since the exponential tilt defining D_A gives it a nice form (no integration needed, unlike the case of continuous A). This should be done using tmle3.
  • Re-organization of package contents to begin accommodating continuous interventions; some machinery can be easily borrowed from txshift.

missing outcome support

We should implement a procedure that estimates the full data parameter in the presence of a censoring process, e.g., a data structure like O = (W, A, Z, C, CY), for censoring indicator C. Such an approach would be based on the joint intervention setting C = 1 and the joint intervention on {A, Z} that defines our causal parameters. The estimation procedures would then simply incorporate an extra set of IP weights, specifically to address this intervention, i.e., 1/g(C = 1 | ...).
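A hypothetical sketch of how the extra censoring weights might enter, assuming estimated probabilities of remaining uncensored, pi_C = g(C = 1 | ...), are available; the names below are illustrative.

censoring_weights <- function(C, pi_C) {
  # weights are nonzero only for uncensored observations (C = 1)
  C / pi_C
}

# the existing IP weights would then simply be multiplied by these, e.g.,
# full_weights <- ip_weights * censoring_weights(C, pi_C)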

Support for observation-level IDs

In some applications, it may be useful to support the presence of hierarchical structures in which individual units belong to clusters (e.g., families, hospitals, schools). To support valid estimation and inference in such settings, observation-level IDs must be passed to nuisance regression estimators (such that cross-validation respects these) and to the inferential machinery (averaging EIF estimates at the cluster level).
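A minimal sketch of the two places the IDs would enter, assuming a vector of cluster-level IDs (ids) and a vector of estimated EIF values (eif_est); all names below are illustrative.

library(origami)
library(data.table)

# (1) cross-validation that respects the clustering
ids <- rep(1:100, each = 5)    # hypothetical cluster memberships
folds <- make_folds(length(ids), cluster_ids = ids)

# (2) inference based on cluster-level averages of the EIF estimates
eif_est <- rnorm(length(ids))  # hypothetical EIF values
eif_dt <- data.table(id = ids, eif = eif_est)
eif_by_cluster <- eif_dt[, .(eif = mean(eif)), by = id]
se_est <- sqrt(var(eif_by_cluster$eif) / nrow(eif_by_cluster))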

utility function for Dzw/substitution

The procedures to compute the Dzw component of the EIF and the substitution estimator are nearly identical (the substitution estimator simply requires that an empirical mean be taken over the Dzw component of the EIF values). This should be abstracted into a single utility function to be used in both situations.
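A hypothetical sketch of the shared utility; the name and arguments are illustrative. The Dzw vector would be computed once, with the substitution estimator obtained by simply averaging it.

summarize_dzw <- function(dzw, type = c("eif", "substitution")) {
  type <- match.arg(type)
  if (type == "substitution") {
    # substitution estimator: empirical mean over the Dzw vector
    return(mean(dzw))
  }
  # EIF use: return the observation-level Dzw vector itself
  dzw
}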

TML estimator for binary interventions

TMLE for the binary intervention case should be easy since the exponential tilt defining D_A gives it a nice form (no integration needed, unlike the case for continuous A). We should implement this using the framework exposed in tmle3.

Nuisance parameter phi should use training data for one-step

Currently, computing the nuisance parameter phi does not make use of the training-validation split necessary for computing a cross-fitted one-step estimator. That is, phi is computed on only a single data set (https://github.com/nhejazi/medshift/blob/master/R/fit_mechanisms.R#L280-L299). In computing the cross-fitted one-step estimator, only the validation data is used for phi (https://github.com/nhejazi/medshift/blob/master/R/estim_helpers.R#L163-L167).
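A hypothetical sketch of cross-fitting phi within each origami fold; cv_fit_phi is an illustrative name, and the glm() regression below is only a stand-in for however phi is actually fit in the package.

cv_fit_phi <- function(fold, data) {
  # split the data according to the current fold
  train_data <- origami::training(data)
  valid_data <- origami::validation(data)

  # fit phi on the training split only (glm as a placeholder for the real fit)
  phi_fit <- glm(Z ~ W, data = train_data, family = "binomial")

  # evaluate phi on the validation split, as the cross-fitted one-step requires
  list(phi_pred = predict(phi_fit, newdata = valid_data, type = "response"))
}

# folds <- origami::make_folds(data, V = 10)
# phi_cv <- origami::cross_validate(cv_fit_phi, folds, data)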
