
Contributors: josef-pkt

misc's Issues

SUMM: What's the purpose of estimation?

Why do we estimate? What are we using the results for?

  • parameter estimates: how large is this effect, and is there even an effect?
  • implied summary statistics: what's the average treatment effect for this subgroup? What's the average marginal effect? Risk ratios, ...
  • predicted mean: what's the expected value for this new observation?
  • predicted distribution: what's some distributional property of a new observation? Prediction confidence intervals, tail probabilities, ...
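These four uses can be sketched on one toy model. A hand-rolled Poisson regression (Newton iterations on the log-likelihood; the data and all variable names here are illustrative, not an existing API):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = rng.poisson(np.exp(0.5 + 0.3 * x))
X = np.column_stack([np.ones_like(x), x])

# Newton iterations on the Poisson log-likelihood: score = X'(y - mu)
beta = np.zeros(2)
for _ in range(25):
    mu = np.exp(X @ beta)
    beta = beta + np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (y - mu))

# 1. parameter estimate: how large is the effect of x?
slope = beta[1]
# 2. implied summary statistic: average marginal effect of x on E[y|x]
ame = (np.exp(X @ beta) * beta[1]).mean()
# 3. predicted mean for a new observation with x = 1
mu_new = np.exp(beta @ np.array([1.0, 1.0]))
# 4. predicted distribution: tail probability P(y_new >= 5) at x = 1
tail = stats.poisson.sf(4, mu_new)
```

The same fitted parameters feed all four answers, but the later ones lean on progressively stronger distributional assumptions.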

Questions that come up while pondering model diagnostics and goodness-of-fit (or lack-of-fit) measures:
e.g. Poisson and all LEF models estimate the expected value consistently even if the rest of the distribution is misspecified, but Poisson is not a good model for the distribution of overdispersed data.
When do we care about distributional fit diagnostics, tests and measures?
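A minimal numeric illustration of the LEF point, assuming negative binomial data as the overdispersed example: the Poisson MLE of a constant mean is just the sample mean and stays consistent, while the Poisson-implied variance (equal to the mean) badly understates the true variance:

```python
import numpy as np

rng = np.random.default_rng(1)
# negative binomial with mean 5 and variance 17.5: clearly overdispersed
y = rng.negative_binomial(n=2, p=2 / 7, size=20000)

mu_hat = y.mean()         # Poisson MLE of a constant mean: consistent anyway
var_implied = mu_hat      # Poisson asserts Var(y) == E[y]
var_data = y.var()        # the data disagree by a factor of ~3.5
```

Any inference or prediction interval built from the Poisson variance would be far too narrow here, which is exactly when distributional diagnostics matter.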

Task: Collect examples and recipes

I wrote quite a lot of examples and recipes over time that are "misplaced" somewhere.
Collect them and check which are more widely useful and haven't been included in statsmodels yet.

  • Generalized IRF and FEVD for VAR

Generic Estimator classes

Towards more generic estimator frameworks

  • least squares
  • generalized linear model GLM
  • MLE
  • Quasi-MLE
  • GEE
  • GMM
  • M-estimators
  • minimum distance estimators

All of the above are extremum estimators, although GEE is usually formalized in terms of estimating equations. Besides these, we can also have estimators that directly specify the estimating equations, possibly without any consistent objective function to minimize. Examples occur in robust estimation, where estimating equations for the mean and the scale are specified separately. Feasible GLS might also belong in this category, as an estimation method without the extra MLE interpretation.
Two points: first, estimators like OLS, GLM, GEE and RLM solve the estimating equations directly without going through an objective function; OLS and GLM have a (Q)MLE interpretation, while RLM is an M-estimator. Second, even robust estimators could be cast in an exactly identified GMM framework; however, similar to FGLS and GLM/LEF, they exploit asymptotic independence, or independence in expectation, of the mean and variance terms.
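A sketch of specifying the estimating equation directly, with no objective function in sight: Huber's location M-estimator with a separately fixed MAD scale (the tuning constants are the conventional ones; the setup itself is illustrative):

```python
import numpy as np
from scipy import optimize

rng = np.random.default_rng(2)
y = np.concatenate([rng.normal(0, 1, 95), np.full(5, 20.0)])  # 5% gross outliers

# scale estimated separately (MAD) and then held fixed, as in the text
scale = np.median(np.abs(y - np.median(y))) / 0.6745

def estimating_eq(m, c=1.345):
    # sum of clipped (Huber psi) residuals; the root defines the estimator
    u = (y - m) / scale
    return np.clip(u, -c, c).sum()

# the equation is monotone in m, so a bracketing root finder solves it directly
m_huber = optimize.brentq(estimating_eq, y.min(), y.max())
```

Nothing is minimized explicitly: the root of the estimating equation *is* the estimate, and the outliers move it far less than they move the sample mean (about 1.0 here).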

What's the hierarchical tree?
What is common? What is specific?

How can we structure the code with class hierarchies, mixins, and functions for the best code reuse and flexibility?
(Finetuning and special casing can always be done on the lowest level, so that's not directly relevant for the overall structure.)

When or how do we combine different estimators in one model? (e.g. MLE and QMLE, MLE and GMM/OLS, MLE and M-estimators)
Example: a robust cov_type turns MLE into QMLE; OLS/WLS mixes the MLE/QMLE and LS/GMM interpretations.
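One way to make the cov_type point concrete is as a covariance mixin: the point estimates come from the same estimating equations, and only the covariance (the interpretation as MLE vs QMLE) changes. All class names below are made up for illustration:

```python
import numpy as np

class LeastSquares:
    """Solves the least-squares estimating equations X'(y - Xb) = 0 directly."""
    def fit(self, X, y):
        self.X, self.y = X, y
        self.params = np.linalg.solve(X.T @ X, X.T @ y)
        self.resid = y - X @ self.params
        return self

class MLECovMixin:
    """Covariance under the full MLE assumption (homoskedastic errors)."""
    def cov_params(self):
        X, e = self.X, self.resid
        sigma2 = e @ e / (len(e) - X.shape[1])
        return sigma2 * np.linalg.inv(X.T @ X)

class QMLECovMixin:
    """HC0 sandwich covariance: the QMLE/GMM view of the same estimator."""
    def cov_params(self):
        X, e = self.X, self.resid
        bread = np.linalg.inv(X.T @ X)
        meat = (X * (e ** 2)[:, None]).T @ X
        return bread @ meat @ bread

class OLS(MLECovMixin, LeastSquares):
    pass

class OLSHC0(QMLECovMixin, LeastSquares):
    pass

rng = np.random.default_rng(3)
x = rng.normal(size=400)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + np.abs(x) * rng.normal(size=400)  # heteroskedastic errors

res = OLS().fit(X, y)
res_hc0 = OLSHC0().fit(X, y)
```

Identical parameter estimates, different standard errors: exactly the structure a mixin captures cleanly, since the covariance choice is orthogonal to the fitting step.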

Components differ whether we want

  • full model with everything
  • prediction and forecasting only (machine learning style)
  • inference only: hypothesis tests (although we are moving towards simplified model for this, e.g. contingency table, and weightstats)

numerical algorithm - speed versus convenience versus robustness

some thoughts on choosing algorithms and algorithm libraries

linear algebra

  • high level, easy to write and read: inv, pinv, solve
  • low level, numerically more efficient, allowing fine-tuning and reuse of factorization results: SVD, QR, ...
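The tradeoff in two lines each, for a least-squares problem (a sketch; the high-level call is one readable expression, the low-level route factors once and can reuse Q and R for covariances, rank checks, or many right-hand sides):

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 5))
y = rng.normal(size=200)

# high level: short, readable, SVD-based under the hood
beta_pinv = np.linalg.pinv(X) @ y

# low level: factor once, then each extra right-hand side is a cheap
# triangular solve
Q, R = np.linalg.qr(X)
beta_qr = linalg.solve_triangular(R, Q.T @ y)
```

Both give the same least-squares solution; the difference shows up in speed and in what intermediate results stay available.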

sparse versus loops versus pandas groupby

What's the best approach for block structures, group handling and strata?
examples:
Kronecker products in systems of equations,
balanced or unbalanced panel or cluster data
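For something as simple as group means over cluster ids, the loop-versus-vectorized tradeoff can be sketched with numpy alone (pandas groupby would be a third option, traded off against the extra dependency):

```python
import numpy as np

rng = np.random.default_rng(5)
groups = rng.integers(0, 100, size=10_000)   # e.g. cluster or panel ids
y = rng.normal(size=10_000)

# loop over groups: easy to read, but one full-array scan per group
means_loop = np.array([y[groups == g].mean() for g in range(100)])

# one vectorized pass: group sums and counts via bincount
counts = np.bincount(groups, minlength=100)
sums = np.bincount(groups, weights=y, minlength=100)
means_binned = sums / counts
```

The bincount route generalizes to group-wise demeaning and within-group variances, which is most of what balanced or unbalanced panel handling needs.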

nearest neighbor, matching

still pretty unused: direct interval calculation for 1-D, kd-trees, ball-point queries

examples:
local regression (lowess, KernelRegression)
matching algorithms: multiple imputation, propensity score matching
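A sketch of nearest-neighbor matching on a one-dimensional score with scipy's kd-tree (the propensity-score framing and the uniform scores are illustrative):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(6)
score_treated = rng.uniform(0.2, 0.8, size=(50, 1))   # stand-in propensity scores
score_control = rng.uniform(0.0, 1.0, size=(500, 1))

# build the tree once over the controls, then query all treated units at once
tree = cKDTree(score_control)
dist, idx = tree.query(score_treated, k=1)  # nearest control per treated unit
matched_controls = score_control[idx, 0]
```

For 1-D scores a sorted array with searchsorted would do the same job; the kd-tree pays off once the matching covariates are multivariate.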

SUMM: minimum distance estimation for misspecified models

Suppose we only have "wrong" models: what's the "best" model, and what are our "best" estimates?

And how do we define "best"?

Motivation:
Gamma regression: Greene compares MLE versus GMM, where the GMM version uses overidentified moment conditions. Gamma regression in GLM/LEF uses only mean and variance assumptions.

preliminaries

  • GLM/LEF estimates the mean/expected value consistently as long as the mean is correctly specified, even if the other parts of the model are misspecified. However, what happens if we are not exclusively interested in the mean? What's the "best" Poisson model if our data is overdispersed, the mean function is possibly misspecified, and we are also interested in predicting quantiles?
  • Misspecified MLE minimizes the Kullback-Leibler distance to the true model, and the parameters converge to pseudo-true values.
  • choosing parameters by optimal prediction accuracy is just a special case with a particular distance measure for prediction accuracy; it is mostly used for hyperparameter and model selection (with cross-validation), but usually not for the parameter estimation itself. (special case: time series models that minimize squared one-step-ahead forecast errors for the parameter estimation)
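The parenthetical time-series special case can be sketched directly: estimate an AR(1) coefficient by minimizing squared one-step-ahead forecast errors, which for this model coincides with conditional least squares (the simulation setup is illustrative):

```python
import numpy as np
from scipy import optimize

rng = np.random.default_rng(7)
phi_true = 0.6
y = np.zeros(2000)
for t in range(1, len(y)):
    y[t] = phi_true * y[t - 1] + rng.normal()

def forecast_ssr(phi):
    # sum of squared one-step-ahead forecast errors y_t - phi * y_{t-1}
    e = y[1:] - phi * y[:-1]
    return e @ e

phi_hat = optimize.minimize_scalar(forecast_ssr, bounds=(-0.99, 0.99),
                                   method="bounded").x
```

Here prediction accuracy *is* the distance measure used for parameter estimation, not just for model selection.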

So, what's the point?

  • provide and use more general minimum distance estimators
  • be clear what the objective function is, i.e. distance measure
  • focus on implied final statistic (e.g. average effect, prediction)

... ???
