Comments (6)
A better model would be one that actually corresponds to the data-generating process. To achieve that, it should include the income information of all categories:
    with pm.Model() as m_11_13:
        a = pm.Normal('a', 0., 1., shape=2)
        b = pm.HalfNormal('b', 0.5)
        s1 = a[0] + b*income[0]
        s2 = a[1] + b*income[1]
        s3 = 0 + b*income[2]
        p = tt.nnet.softmax(pm.math.stack([s1, s2, s3]))
        obs = pm.Categorical('career', p=p, observed=career)
        trace_11_13 = pm.sample(target_accept=.95)
The model above runs just fine and makes correct inferences. Equivalently, if you want to shift one of the categories to be zero before the softmax transformation (which is not needed), you should do the following:
    with pm.Model() as m_11_13_alt:
        a = pm.Normal('a', 0., 1., shape=2)
        b = pm.HalfNormal('b', 0.5)
        s1 = a[0] + b*(income[0] - income[2])
        s2 = a[1] + b*(income[1] - income[2])
        s3 = 0
        p = tt.nnet.softmax(pm.math.stack([s1, s2, s3]))
        obs = pm.Categorical('career', p=p, observed=career)
        trace_11_13_alt = pm.sample(target_accept=.95)
This is not the same as simply setting the unnormalized probability of the pivot category to zero, as the original model was doing. I think this stems from a confusion (which I also used to have) with conventional multinomial models, where the predictors are shared and different choices have different coefficients. In that more common case, setting the pivot coefficients (and intercept) to zero is exactly the same as setting the unnormalized probability of the pivot category to zero.
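As a quick sanity check of the equivalence claimed above, here is a minimal NumPy sketch (not taken from the notebook). The values of `a`, `b` and `income` are made up for illustration, and `softmax` is a small helper defined here rather than the Theano function used in the models. It compares the full model's scores, the pivoted scores, and the "naive" version that only zeroes the pivot score:

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    e = np.exp(scores - np.max(scores))
    return e / e.sum()

# Hypothetical parameter values and incomes, chosen only for illustration.
a = np.array([0.5, -0.3])
b = 0.4
income = np.array([1.0, 2.0, 5.0])

# Scores as in m_11_13: every category uses its own income.
full = np.array([a[0] + b * income[0],
                 a[1] + b * income[1],
                 0.0 + b * income[2]])

# Scores as in m_11_13_alt: incomes expressed relative to the pivot category.
pivoted = np.array([a[0] + b * (income[0] - income[2]),
                    a[1] + b * (income[1] - income[2]),
                    0.0])

# Naive pivot: only the third score is set to zero, incomes are not differenced.
naive = np.array([a[0] + b * income[0],
                  a[1] + b * income[1],
                  0.0])

p_full = softmax(full)
p_pivoted = softmax(pivoted)
p_naive = softmax(naive)

print("full   :", p_full)
print("pivoted:", p_pivoted)
print("naive  :", p_naive)
```

The first two rows should agree, because softmax is unchanged when the same constant (here `b * income[2]`) is subtracted from every score. The naive version drops that term from one score only, so it describes a different model.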
Here is a Colab Notebook showing the results: https://colab.research.google.com/drive/1GajarpZ3M-QLs7pQWLDpoqC-YOWBp7aU?usp=sharing
from pymc-resources.
Thanks for your insight @ricardoV94! Indeed this original model has some problems. I did try to bound the b coefficient but (I don't remember why exactly, but I remember banging my head against the wall quite a lot 😅) this didn't work at all -- I think it's related to the choice of pivot.
I reached out to R. McElreath about this but he never answered. So I opted for the model that seemed the most accurate and close to the book to me (and as you can see, b is indeed inferred to be positive when you take the right pivot).
As a result, I don't think it's an issue with the PyMC model per se, so I'm closing this, but feel free to reopen if you think you can do a PR with a better model -- that'd be awesome!
Thanks a lot @ricardoV94, this makes sense and is very interesting!
Wanna make a PR to implement this in the NB? Also, do you happen to have a reference that helped you clear that confusion between the two types of multinomial models?
I can do the PR if you think it makes sense to include this model which is different from the one implemented in the book (I guess the current one is already different).
Two sources really helped me clear the confusion:
- Hoffman, S. D., & Duncan, G. J. (1988). Multinomial and conditional logit discrete-choice models in demography. Demography, 25(3), 415-427.
- Croissant, Y. (2020). Estimation of Random Utility Models in R: The mlogit Package. Journal of Statistical Software, 95(1), 1-41.
This second paper describes the R library mlogit, which can run these types of multinomial models. You can jump to section 2.2, "Model description", for the relevant distinction between shared and unique coefficients / covariates.
The terms are a bit confusing, but this gets to the heart of it:
It is clear from the previous expression that coefficients of choice situation specific variables (the intercept being one of those) should be alternative specific, otherwise they would disappear in the differentiation. Moreover, only differences of these coefficients are relevant and can be identified. For example, with three alternatives 1, 2 and 3, the three coefficients γ₁, γ₂, γ₃ associated to a choice situation specific variable cannot be identified, but only two linear combinations thereof. Therefore, one has to make a choice of normalization and the simplest one is to simply set γ₁ = 0.
Coefficients for alternative and choice situation specific variables may (or may not) be alternative specific. For example, transport time is alternative specific, but 10 min in public transport may not have the same impact on utility than 10 min in a car. In this case, alternative specific coefficients are relevant. Monetary cost is also alternative specific, but in this case, one can consider that 1$ is 1$ whether it is spent for the use of a car or in public transports. In this case, a generic coefficient is appropriate.
The author doesn't say it explicitly, but the second type of coefficient does not need to be pivoted, as you can also see from the output of the model in section 3.5, "Application".
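To make the identification argument concrete, here is a small NumPy sketch (my own illustration, not from the paper). All numbers and names (`x`, `gamma`, `z`, `beta`) are hypothetical. It shows that alternative-specific coefficients on a choice-situation-specific variable are only identified up to a common shift, whereas a generic coefficient on an alternative-specific variable changes the probabilities when it changes, so it needs no pivot:

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    e = np.exp(scores - np.max(scores))
    return e / e.sum()

# Case 1: a choice-situation-specific variable x (same value for every
# alternative) with alternative-specific coefficients gamma.
x = 1.7                              # hypothetical covariate value
gamma = np.array([0.4, -0.2, 1.1])   # hypothetical coefficients

p_gamma = softmax(gamma * x)
p_gamma_shifted = softmax((gamma + 5.0) * x)       # add a constant to all gammas
p_gamma_pivoted = softmax((gamma - gamma[0]) * x)  # normalization gamma_1 = 0

# Case 2: an alternative-specific variable z (one value per alternative)
# with a single generic coefficient beta.
z = np.array([1.0, 2.0, 5.0])        # hypothetical covariate values
beta = 0.3                           # hypothetical coefficient

p_beta = softmax(beta * z)
p_beta_changed = softmax((beta + 5.0) * z)

print("gamma, shifted, pivoted:", p_gamma, p_gamma_shifted, p_gamma_pivoted)
print("beta vs. changed beta  :", p_beta, p_beta_changed)
```

In the first case all three probability vectors coincide, which is why one of the γ's has to be fixed. In the second case the probabilities move with `beta`, so the data can pin it down without any normalization.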
The author also clears up some of the confusion between the different terms people have been using:
A logit model with only choice situation specific variables is sometimes called a multinomial logit model, one with only alternative specific variables a conditional logit model and one with both kind of variables a mixed logit model. This is seriously misleading: conditional logit model is also a logit model for longitudinal data in the statistical literature and mixed logit is one of the names of a logit model with random parameters. Therefore, in what follows, we will use the name multinomial logit model for the model we have just described whatever the nature of the explanatory variables used.
This is super useful! Honestly, I think this would clearly be valuable to make a PR out of it (and you can briefly summarize the problem and differences as you did there + link to the two references).
As the original model from the book seems to have issues, we might as well err towards a statistically sounder one, while making our choice explicit -- and as you said, the PyMC3 model is already different 🤷‍♂️
Reopening as a consequence, to link the issue to the PR 😉
Will do ;)