Multinomial Logit With Latent Class: mlogit_latent_class.py
The utils.py file contains a numerically stable version of the softmax function and a numeric Hessian approximation based on central finite differences. The Hessian matrix may not be invertible, or its inverse may have non-positive values on the diagonal, so standard errors are computed by decomposing the Hessian with SVD and inverting the diagonal matrix.
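A minimal sketch of such utilities, assuming NumPy (the names `stable_softmax`, `numeric_hessian`, and `svd_std_errors` are illustrative, not necessarily the ones in utils.py):

```python
import numpy as np

def stable_softmax(z, axis=-1):
    # Subtract the max before exponentiating to avoid overflow.
    z = z - np.max(z, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)

def numeric_hessian(f, beta, eps=1e-5):
    # Central finite differences for second derivatives:
    # H[i, j] ~ (f(b+ei+ej) - f(b+ei-ej) - f(b-ei+ej) + f(b-ei-ej)) / (4 * eps^2)
    m = beta.size
    H = np.zeros((m, m))
    for i in range(m):
        ei = np.zeros(m); ei[i] = eps
        for j in range(m):
            ej = np.zeros(m); ej[j] = eps
            H[i, j] = (f(beta + ei + ej) - f(beta + ei - ej)
                       - f(beta - ei + ej) + f(beta - ei - ej)) / (4 * eps ** 2)
    return H

def svd_std_errors(H):
    # Pseudo-inverse via SVD: H = U diag(d) V^T  =>  H^+ = V diag(1/d) U^T.
    U, d, Vt = np.linalg.svd(H)
    d = np.maximum(d, 1e-12)                 # floor tiny singular values
    H_inv = Vt.T @ np.diag(1.0 / d) @ U.T
    # abs() is an illustrative guard against the non-positive diagonal
    # entries mentioned above; an assumption, not necessarily the repo's rule.
    return np.sqrt(np.abs(np.diag(H_inv)))
```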
Note that scipy.optimize.minimize calls the objective, gradient, and Hessian with different values of beta, so caching the forward calculation brings no benefit.
Standard Multinomial Logit
Assume the observations are IID.
Notation
N is the number of observations, T is the number of time periods, K is the number of choices, and M is the number of features; X: [N,T,K,M], Y: [N,T], and $\beta$: [M,]
Utility for choice $k$ is
$$
U_{k} = X_{k}\beta
$$
Let $C = \max_k X_k \beta$. The probability of choosing $k$ is

$$
P_k = \frac{\exp(X_k\beta - C)}{\sum_{j=1}^{K} \exp(X_j\beta - C)}
$$

Subtracting $C$ before exponentiating keeps the softmax numerically stable. The standard error of $\hat\beta$ is the square root of the diagonal of $H^{-1}$, where $H$ is the Hessian of the negative log-likelihood at the optimum; if $H$ cannot be inverted directly, use SVD to decompose $H$ and then take the inverse of the diagonal matrix $D$.
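A sketch of the forward pass and negative log-likelihood under this setup, assuming Y stores integer choice indices (function names are illustrative):

```python
import numpy as np

def choice_probabilities(X, beta):
    # X: [N, T, K, M], beta: [M,]  ->  utilities U: [N, T, K]
    U = X @ beta
    # Max-subtraction trick from the formula above.
    U = U - U.max(axis=-1, keepdims=True)
    e = np.exp(U)
    return e / e.sum(axis=-1, keepdims=True)      # P: [N, T, K]

def neg_log_likelihood(beta, X, Y):
    # Y: [N, T] with entries in {0, ..., K-1}.
    P = choice_probabilities(X, beta)
    N, T = Y.shape
    chosen = P[np.arange(N)[:, None], np.arange(T)[None, :], Y]
    return -np.log(chosen).sum()
```

This `neg_log_likelihood` can be passed straight to scipy.optimize.minimize, and the Hessian at the optimum fed to the SVD-based standard-error routine above.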
Latent Class
Notation: H holds individual-level features that determine the latent-class probabilities, L is the number of those features, and Q is the number of latent classes. $P_{tkq}$ is the predicted probability of choice $k$ in latent class $q$ at time $t$.
Data: Y: [N,T], X: [N,T,K,M], H: [N,T,L]
Parameters: $\beta$: [M,Q], $\gamma$: [L,Q]
Model:
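A standard latent class formulation consistent with this notation, under the assumption that class membership follows a softmax of $H\gamma$ and choices within class $q$ follow a multinomial logit with $\beta_{\cdot q}$, is

$$
\pi_{nq} = \frac{\exp(H_n\gamma_{\cdot q})}{\sum_{r=1}^{Q}\exp(H_n\gamma_{\cdot r})},\qquad
P_{tkq} = \frac{\exp(X_{tk}\beta_{\cdot q})}{\sum_{j=1}^{K}\exp(X_{tj}\beta_{\cdot q})},\qquad
L_n = \sum_{q=1}^{Q}\pi_{nq}\prod_{t=1}^{T}P_{t,y_{nt},q}
$$

A sketch of the corresponding negative log-likelihood, assuming the individual-level features are taken from the first period of H (an assumption; time-varying H could be handled differently):

```python
import numpy as np
from scipy.special import logsumexp

def latent_class_nll(params, X, Y, H, Q):
    # Hypothetical parameter packing: beta [M, Q] first, then gamma [L, Q].
    N, T, K, M = X.shape
    L = H.shape[-1]
    beta = params[:M * Q].reshape(M, Q)
    gamma = params[M * Q:].reshape(L, Q)

    # Class-membership log-probabilities log_pi: [N, Q] (H assumed constant over t).
    Z = H[:, 0, :] @ gamma
    log_pi = Z - logsumexp(Z, axis=1, keepdims=True)

    # Within-class choice log-probabilities log_P: [N, T, K, Q].
    U = np.einsum("ntkm,mq->ntkq", X, beta)
    log_P = U - logsumexp(U, axis=2, keepdims=True)

    # Log-probability of the observed choice sequence per class: [N, Q].
    chosen = log_P[np.arange(N)[:, None], np.arange(T)[None, :], Y]  # [N, T, Q]
    log_seq = chosen.sum(axis=1)

    # Mix over classes in log space, then sum over individuals.
    return -logsumexp(log_pi + log_seq, axis=1).sum()
```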