Comments (5)
Was this solved? I'm having the same error using dowhy.fit.
from econml.
No, I have not heard back yet
I realized my instrument variable wasn't binary, and it had to be. I am using EconML and DoWhy in tandem, following the sample notebooks available online for EconML.
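As a minimal sketch of the check described in the comment above, the following confirms that an instrument column contains only 0 and 1 before fitting. The helper name `is_binary` and the sample values are hypothetical and are not part of EconML or DoWhy.

```python
def is_binary(values):
    """Return True if every value is 0 or 1 (ints, bools, or floats like 1.0)."""
    return all(v in (0, 1) for v in values)


# Hypothetical instrument columns for illustration
z_ok = [0, 1, 1, 0, 1]
z_bad = [0, 1, 2, 1, 0]

print(is_binary(z_ok))   # True
print(is_binary(z_bad))  # False
```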
Sorry for the slow response - a couple of thoughts:
- It would help if you could provide a simplified repro; are a significant number of units treated and untreated at each time point?
- The DynamicDML class is intended for scenarios where treatments may be repeated, so it is not necessary to keep T=1 after the time of first treatment unless the units are actually continuing to receive treatment.
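One way to answer the first question above is to tabulate treated and untreated units per time point. This is only a sketch under assumptions: the helper name `treatment_counts_by_period` and the small panel are hypothetical, and the data is assumed to hold one entry per (unit, period) observation.

```python
from collections import Counter, defaultdict


def treatment_counts_by_period(periods, treatments):
    """Count how many observations have each treatment value in each period."""
    counts = defaultdict(Counter)
    for period, treatment in zip(periods, treatments):
        counts[period][treatment] += 1
    # Plain dicts, ordered by period, are easier to print and compare
    return {period: dict(c) for period, c in sorted(counts.items())}


# Hypothetical panel: one entry per (unit, period) observation
periods = [0, 0, 0, 1, 1, 1, 2, 2, 2]
treatments = [0, 0, 1, 0, 1, 1, 1, 1, 1]

for period, c in treatment_counts_by_period(periods, treatments).items():
    print(period, c)
```

In this toy panel, period 2 has no untreated units, which is the kind of imbalance the question is asking about.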
Hi @kbattocchi and @samanbanafti, I am wondering how this issue can be solved. I encountered the same problem when using CausalForestDML with dowhy.fit and discrete_treatment set to True. My treatment is a categorical variable of category type, with values such as "High Impact", "Medium Impact", and "Low Impact". The same setup worked when I used the model on a continuous treatment variable, except that in that case model_t was not a RandomForestClassifier and discrete_treatment was False.
Code:
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import GridSearchCV
from econml.dml import CausalForestDML

# First-stage outcome model: grid-searched random forest regressor
first_stage_reg = lambda: GridSearchCV(
    estimator=RandomForestRegressor(n_estimators=1000),
    param_grid={
        'max_depth': max_depth,
        'max_features': max_features,
        'min_samples_split': min_samples_split
    }, cv=5, n_jobs=-1, scoring='neg_mean_squared_error'
)
# First-stage treatment model: grid-searched random forest classifier
first_stage_class = lambda: GridSearchCV(
    estimator=RandomForestClassifier(n_estimators=1000),
    param_grid={
        'max_depth': max_depth,
        'max_features': max_features,
        'min_samples_split': min_samples_split
    }, cv=5, n_jobs=-1, scoring='neg_mean_squared_error'
)
model_y = first_stage_reg().fit(X, Y).best_estimator_
model_t = first_stage_class().fit(X, T).best_estimator_
est_nonparam = CausalForestDML(model_y=model_y, model_t=model_t, discrete_treatment=True, n_estimators=1000, cv=5)
est_nonparam_dw = est_nonparam.dowhy.fit(Y, T, X, W=None, groups=groups,
    outcome_names=target_feature1,
    treatment_names=['RegulatoryIndex'],
    feature_names=Agg_df_imputed_transformed.iloc[:, ~Agg_df_imputed_transformed.columns.isin(['RegulatoryIndex'] +
        target_features + country_indicator)].columns.tolist(),
    inference='blb')
Error:
One or more of the test scores are non-finite: [nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
nan nan nan nan nan nan nan nan nan nan nan nan nan nan]
econml has not been tested with dowhy versions >= 0.11
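The non-finite test scores reported above are likely unrelated to the AttributeError and may instead come from the grid search for the treatment model: `neg_mean_squared_error` is a regression metric, and a squared error cannot be computed on string class labels such as "High Impact". When scoring fails, GridSearchCV by default records NaN for that fit, which would produce this warning. The sketch below illustrates the underlying problem with two hand-written metrics (hypothetical stand-ins, not the scikit-learn implementations) and made-up labels.

```python
def mean_squared_error(y_true, y_pred):
    """Plain squared-error metric: only defined for numeric targets."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of exact label matches: works for any comparable labels."""
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)


# Hypothetical categorical treatment labels and classifier predictions
t_true = ["High Impact", "Low Impact", "Medium Impact", "Low Impact"]
t_pred = ["High Impact", "Low Impact", "Low Impact", "Low Impact"]

print(accuracy(t_true, t_pred))  # 0.75

try:
    mean_squared_error(t_true, t_pred)
except TypeError as exc:
    # Subtracting two strings is not defined, so the metric cannot be evaluated
    print("squared error is undefined for string labels:", exc)
```

If this is the cause, passing a classification metric such as `scoring='accuracy'` or `scoring='neg_log_loss'` to the GridSearchCV that wraps RandomForestClassifier should give finite scores.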
AttributeError Traceback (most recent call last)
Cell In[232], line 27
23 model_t = first_stage_class().fit(X, T).best_estimator_
25 est_nonparam = CausalForestDML(model_y=model_y, model_t=model_t, discrete_treatment=True, n_estimators=1000, cv=5)
---> 27 est_nonparam_dw = est_nonparam.dowhy.fit(Y, T, X, W=None, groups=groups,
28 outcome_names=target_feature1,
29 treatment_names=['RegulatoryIndex'],
30 feature_names=Agg_df_imputed_transformed.iloc[:, ~Agg_df_imputed_transformed.columns.isin(['RegulatoryIndex']+
31 target_features+
32 country_indicator)].columns.tolist(),
33 inference='blb')
File ~\AppData\Local\anaconda3\Lib\site-packages\econml\dowhy.py:180, in DoWhyWrapper.fit(self, Y, T, X, W, Z, outcome_names, treatment_names, feature_names, confounder_names, instrument_names, graph, estimand_type, proceed_when_unidentifiable, missing_nodes_as_confounders, control_value, treatment_value, target_units, **kwargs)
178 for p in self.get_params():
179 init_params[p] = getattr(self.cate_estimator, p)
--> 180 self.estimate = self.dowhy.estimate_effect(self.identified_estimand_,
181 method_name=method_name,
182 control_value=control_value,
183 treatment_value=treatment_value,
184 target_units=target_units,
185 method_params={
186 "init_params": init_params,
187 "fit_params": kwargs,
188 },
189 )
190 return self
File ~\AppData\Local\anaconda3\Lib\site-packages\dowhy\causal_model.py:360, in CausalModel.estimate_effect(self, identified_estimand, method_name, control_value, treatment_value, test_significance, evaluate_effect_strength, confidence_intervals, target_units, effect_modifiers, fit_estimator, method_params)
349 causal_estimator = causal_estimator_class(
350 identified_estimand,
351 test_significance=test_significance,
(...)
355 **extra_args,
356 )
358 self._estimator_cache[method_name] = causal_estimator
--> 360 return estimate_effect(
361 self._data,
362 self._treatment,
363 self._outcome,
364 identifier_name,
365 causal_estimator,
366 control_value,
367 treatment_value,
368 target_units,
369 effect_modifiers,
370 fit_estimator,
371 method_params,
372 )
File ~\AppData\Local\anaconda3\Lib\site-packages\dowhy\causal_estimator.py:719, in estimate_effect(data, treatment, outcome, identifier_name, estimator, control_value, treatment_value, target_units, effect_modifiers, fit_estimator, method_params)
714 return CausalEstimate(
715 None, None, None, None, None, None, control_value=control_value, treatment_value=treatment_value
716 )
718 if fit_estimator:
--> 719 estimator.fit(
720 data=data,
721 effect_modifier_names=effect_modifiers,
722 **method_params["fit_params"] if "fit_params" in method_params else {},
723 )
725 estimate = estimator.estimate_effect(
726 data,
727 treatment_value=treatment_value,
(...)
730 confidence_intervals=estimator._confidence_intervals,
731 )
733 if estimator._significance_test:
File ~\AppData\Local\anaconda3\Lib\site-packages\dowhy\causal_estimators\econml.py:194, in Econml.fit(self, data, effect_modifier_names, **kwargs)
190 estimator_named_args = estimator_argspec.args + estimator_argspec.kwonlyargs
191 estimator_data_args = {
192 arg: named_data_args[arg] for arg in named_data_args.keys() if arg in estimator_named_args
193 }
--> 194 self.estimator.fit(**estimator_data_args, **kwargs)
196 return self
File ~\AppData\Local\anaconda3\Lib\site-packages\econml\dml\causal_forest.py:854, in CausalForestDML.fit(self, Y, T, X, W, sample_weight, groups, cache_values, inference)
852 if X is None:
853 raise ValueError("This estimator does not support X=None!")
--> 854 return super().fit(Y, T, X=X, W=W,
855 sample_weight=sample_weight, groups=groups,
856 cache_values=cache_values,
857 inference=inference)
File ~\AppData\Local\anaconda3\Lib\site-packages\econml\dml\_rlearner.py:422, in _RLearner.fit(self, Y, T, X, W, sample_weight, freq_weight, sample_var, groups, cache_values, inference)
385 """
386 Estimate the counterfactual model from data, i.e. estimates function :math:`\\theta(\\cdot)`.
387
(...)
419 self: _RLearner instance
420 """
421 # Replacing fit from _OrthoLearner, to enforce Z=None and improve the docstring
--> 422 return super().fit(Y, T, X=X, W=W,
423 sample_weight=sample_weight, freq_weight=freq_weight, sample_var=sample_var, groups=groups,
424 cache_values=cache_values,
425 inference=inference)
File ~\AppData\Local\anaconda3\Lib\site-packages\econml\_cate_estimator.py:131, in BaseCateEstimator._wrap_fit.<locals>.call(self, Y, T, inference, *args, **kwargs)
129 inference.prefit(self, Y, T, *args, **kwargs)
130 # call the wrapped fit method
--> 131 m(self, Y, T, *args, **kwargs)
132 self._postfit(Y, T, *args, **kwargs)
133 if inference is not None:
134 # NOTE: we call inference fit after calling the main fit method
File ~\AppData\Local\anaconda3\Lib\site-packages\econml\_ortho_learner.py:832, in _OrthoLearner.fit(self, Y, T, X, W, Z, sample_weight, freq_weight, sample_var, groups, cache_values, inference, only_final, check_input)
830 nuisances, fitted_models, new_inds, scores = ray.get(self.nuisances_ref[idx])
831 else:
--> 832 nuisances, fitted_models, new_inds, scores = self._fit_nuisances(
833 Y, T, X, W, Z, sample_weight=sample_weight_nuisances, groups=groups)
834 all_nuisances.append(nuisances)
835 self._models_nuisance.append(fitted_models)
File ~\AppData\Local\anaconda3\Lib\site-packages\econml\_ortho_learner.py:982, in _OrthoLearner._fit_nuisances(self, Y, T, X, W, Z, sample_weight, groups)
979 else:
980 folds = splitter.split(to_split, strata)
--> 982 nuisances, fitted_models, fitted_inds, scores = _crossfit(self._ortho_learner_model_nuisance, folds,
983 self.use_ray, self.ray_remote_func_options, Y, T,
984 X=X, W=W, Z=Z, sample_weight=sample_weight,
985 groups=groups)
986 return nuisances, fitted_models, fitted_inds, scores
File ~\AppData\Local\anaconda3\Lib\site-packages\econml\_ortho_learner.py:284, in _crossfit(models, folds, use_ray, ray_remote_fun_option, *args, **kwargs)
282 nuisance_temp, model_out, score_temp = ray.get(fold_refs[idx])
283 else:
--> 284 nuisance_temp, model_out, score_temp = _fit_fold(model, train_idxs, test_idxs,
285 calculate_scores, accumulated_args, kwargs)
287 if idx == 0:
288 nuisances = tuple([np.full((n,) + nuis.shape[1:], np.nan)
289 for nuis in nuisance_temp])
File ~\AppData\Local\anaconda3\Lib\site-packages\econml\_ortho_learner.py:99, in _fit_fold(model, train_idxs, test_idxs, calculate_scores, args, kwargs)
96 kwargs_train = {key: var[train_idxs] for key, var in kwargs.items()}
97 kwargs_test = {key: var[test_idxs] for key, var in kwargs.items()}
---> 99 model.train(False, None, *args_train, **kwargs_train)
100 nuisance_temp = model.predict(*args_test, **kwargs_test)
102 if not isinstance(nuisance_temp, tuple):
File ~\AppData\Local\anaconda3\Lib\site-packages\econml\dml\_rlearner.py:53, in _ModelNuisance.train(self, is_selecting, folds, Y, T, X, W, Z, sample_weight, groups)
51 def train(self, is_selecting, folds, Y, T, X=None, W=None, Z=None, sample_weight=None, groups=None):
52 assert Z is None, "Cannot accept instrument!"
---> 53 self._model_t.train(is_selecting, folds, X, W, T, **
54 filter_none_kwargs(sample_weight=sample_weight, groups=groups))
55 self._model_y.train(is_selecting, folds, X, W, Y, **
56 filter_none_kwargs(sample_weight=sample_weight, groups=groups))
57 return self
File ~\AppData\Local\anaconda3\Lib\site-packages\econml\dml\dml.py:91, in _FirstStageSelector.train(self, is_selecting, folds, X, W, Target, sample_weight, groups)
86 if self._discrete_target:
87 # In this case, the Target is the one-hot-encoding of the treatment variable
88 # We need to go back to the label representation of the one-hot so as to call
89 # the classifier.
90 if np.any(np.all(Target == 0, axis=0)) or (not np.any(np.all(Target == 0, axis=1))):
---> 91 raise AttributeError("Provided crossfit folds contain training splits that " +
92 "don't contain all treatments")
93 Target = inverse_onehot(Target)
95 self._model.train(is_selecting, folds, _combine(X, W, Target.shape[0]), Target,
96 **filter_none_kwargs(groups=groups, sample_weight=sample_weight))
AttributeError: Provided crossfit folds contain training splits that don't contain all treatments
I would really appreciate any help. Thank you very much in advance!
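The AttributeError at the end of the traceback is raised when at least one cross-fitting training split does not contain every treatment level. That can plausibly happen when a category is rare, or when, as here with `groups` passed, a category occurs only in a few groups. The following is a sketch for diagnosing this, not a reproduction of how EconML builds its folds: the helper name `folds_missing_treatments` and the data are hypothetical, and the helper accepts any iterable of (train indices, test indices) pairs, such as the pairs a scikit-learn splitter's `split()` yields.

```python
def folds_missing_treatments(treatments, folds):
    """Return {fold_number: sorted missing levels} for training splits
    that do not contain every treatment level."""
    all_levels = set(treatments)
    problems = {}
    for i, (train_idx, _test_idx) in enumerate(folds):
        seen = {treatments[j] for j in train_idx}
        missing = all_levels - seen
        if missing:
            problems[i] = sorted(missing)
    return problems


# Hypothetical data: "High Impact" occurs only in rows 0 and 1
T = ["High Impact", "High Impact", "Low Impact", "Medium Impact",
     "Low Impact", "Medium Impact"]

# Hand-written 3-fold split; each pair is (train indices, test indices)
folds = [
    ([2, 3, 4, 5], [0, 1]),  # both "High Impact" rows are held out
    ([0, 1, 4, 5], [2, 3]),
    ([0, 1, 2, 3], [4, 5]),
]

print(folds_missing_treatments(T, folds))  # {0: ['High Impact']}
```

If a check like this reports missing levels on the real data, options that may help include merging very rare categories, using fewer folds, or revisiting the grouping so that each training split sees every treatment level.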