Hi @jhelvy,
First of all, thanks a lot for this contribution. I usually use Python, but logitr resolved some convergence issues I was having with xlogit on panel data.
My current question concerns the following: I have estimated a mixed logit model on a panel of individuals, each facing a set of choice tasks. Now suppose I have a separate panel data set containing the same individuals, for which I would like to make predictions. Using the (unconditional) estimated population distribution of the parameters for these predictions is not optimal, since the individuals' earlier choices already carry additional information about their parameters. To be specific, let $g(\beta|\theta)$ be the population distribution of the parameters $\beta$, and let $L(i,t|\beta)=\frac{e^{\beta'X_{it}}}{\sum_j e^{\beta'X_{jt}}}$ be the probability of choosing alternative $i$ in task $t$ conditional on $\beta$. Then, by Bayes' rule, the distribution of the parameters conditional on having observed a sequence of choices $y$ is given by:
$$h(\beta|y,\theta)=\frac{P(y|\beta)g(\beta|\theta)}{P(y|\theta)}$$
where $P(y|\beta)= L(y_1,1|\beta)\times\dots\times L(y_T,T|\beta)$ is the probability of the individual's choice sequence conditional on $\beta$ and $P(y|\theta)=\int P(y|\beta)g(\beta|\theta)d\beta$ is the unconditional probability. Based on this, an individual's estimated probability of choosing $i$ in out-of-sample task $T+1$, simulated with draws $\beta^r$ from $g(\beta|\theta)$, is given by:
$$\tilde{P}(i, T+1|y,\theta)=\frac{\sum_{r}L(i, T+1|\beta^r)P(y|\beta^r)}{\sum_{r}P(y|\beta^r)}$$
I should note that the above notation is from Revelt & Train (2000): "Customer-Specific Taste Parameters and Mixed Logit: Households' Choice of Electricity Supplier."
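To make the formula concrete, here is a minimal pure-Python sketch of the simulated conditional prediction. It is not logitr or xlogit code: the function names, the single-attribute toy data, and the standard normal draws are all assumptions made for illustration. In practice the draws $\beta^r$ would come from the estimated $g(\beta|\hat{\theta})$.

```python
import math
import random

def logit_probs(beta, alternatives):
    """L(i, t | beta) for every alternative i in one task.
    `alternatives` is a list of attribute vectors X_it."""
    utils = [sum(b * x for b, x in zip(beta, alt)) for alt in alternatives]
    m = max(utils)  # subtract the max for numerical stability
    expu = [math.exp(u - m) for u in utils]
    total = sum(expu)
    return [e / total for e in expu]

def sequence_prob(beta, tasks, choices):
    """P(y | beta): product of the logit probabilities of the observed choices."""
    p = 1.0
    for alternatives, y in zip(tasks, choices):
        p *= logit_probs(beta, alternatives)[y]
    return p

def conditional_prediction(draws, tasks, choices, new_task):
    """Simulated P(i, T+1 | y, theta): logit probabilities in the new task,
    averaged over the draws with weights P(y | beta^r)."""
    weights = [sequence_prob(beta, tasks, choices) for beta in draws]
    total = sum(weights)
    probs = [0.0] * len(new_task)
    for beta, w in zip(draws, weights):
        for i, p in enumerate(logit_probs(beta, new_task)):
            probs[i] += w * p
    return [p / total for p in probs]

def unconditional_prediction(draws, new_task):
    """Population-level prediction: equal weight on every draw."""
    probs = [0.0] * len(new_task)
    for beta in draws:
        for i, p in enumerate(logit_probs(beta, new_task)):
            probs[i] += p
    return [p / len(draws) for p in probs]

# Toy example (assumed data): one attribute, two alternatives per task.
random.seed(0)
draws = [[random.gauss(0.0, 1.0)] for _ in range(2000)]  # assumed g(beta | theta) = N(0, 1)
tasks = [[[1.0], [0.0]] for _ in range(4)]  # four observed tasks
choices = [0, 0, 0, 0]                      # the individual always chose alternative 0
new_task = [[1.0], [0.0]]                   # out-of-sample task T+1

cond = conditional_prediction(draws, tasks, choices, new_task)
uncond = unconditional_prediction(draws, new_task)
print("conditional:  ", cond)
print("unconditional:", uncond)
```

Because this individual always picked the alternative with the higher attribute value, the weights $P(y|\beta^r)$ favor draws with larger $\beta$, so the conditional probability of alternative 0 in the new task is higher than the unconditional one.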
From my (limited) understanding of R, your predict method uses the population distribution of the parameters to make predictions and does not offer a `panelID` option for using the conditional distribution. Is that correct? If so, do you know of any way I could use logitr to (1) derive the conditional distribution for each individual, and (2) make predictions based on this conditional distribution?
On an unrelated note, I think I have spotted two bugs:
- If I estimate a multinomial logit with a single parameter and specify a `clusterID`, I get the following error when calling the summary method:
![image](https://user-images.githubusercontent.com/28649389/183884894-d112dd35-720a-4ece-bcf7-d49a63867bea.png)
Note that it works for two or more parameters. Furthermore, the summary method also works for a single parameter if I leave out `clusterID`.
- If I estimate a mixed logit with a single parameter and specify a `clusterID`, I get the following error during estimation:
![image](https://user-images.githubusercontent.com/28649389/183885386-2fd3951c-f0a1-438e-9134-4b5c2093e2bf.png)
Note that the estimation works for two or more parameters. It also works for a single parameter if I leave out `clusterID`.
Many thanks in advance for your time.