Comments (10)
Wait, so is the idea to first generate 30 synthetic data sets using 30 randomly drawn sets of parameter values, plus process noise, and then run MCMC on each of those 30 synthetic datasets? Is the goal to see whether the original 30 sets of parameter values can be recovered?
And I'm not sure I understand the aim of step 4 - why rank them?
from bayessd.
> …goal to see whether the original 30 sets of parameter values can be recovered?
Yes, the idea is to verify the generator we are using. However, I think there are ways to extend this to validation (real data, not synthetic), e.g. use real Y, not
> …the aim of step 4 - why rank them?
Thanks for this question. This method, called simulation-based calibration (SBC), is what Bayesian computation people use for verification (to check how well the statistical and computational models match). There are many packages that support this, e.g. the R package Andrew introduced here:
https://statmodeling.stat.columbia.edu/2022/02/17/sbc-a-new-r-package-for-simulation-based-calibration-of-bayesian-models/
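For intuition, here is a minimal SBC sketch on a hypothetical toy model (a conjugate normal, standing in for the actual Vensim/Stan setup, so the "MCMC" step is an exact posterior draw): draw parameters from the prior, simulate a synthetic dataset, get posterior draws, and record the rank of each true value among them. If the statistical and computational models match, the ranks come out uniform, which is why step 4 ranks them.

```python
import numpy as np

rng = np.random.default_rng(0)

def sbc_ranks(n_sims=200, n_post=100, n_obs=20, sigma=1.0):
    """SBC ranks for a toy conjugate model: theta ~ N(0,1), y_i ~ N(theta, sigma)."""
    ranks = []
    for _ in range(n_sims):
        theta = rng.normal(0.0, 1.0)              # 1) draw parameters from the prior
        y = rng.normal(theta, sigma, size=n_obs)  # 2) simulate a synthetic dataset
        # 3) draw from the posterior (exact conjugate normal posterior here,
        #    standing in for an MCMC run on the synthetic data)
        prec = 1.0 + n_obs / sigma**2
        mu_n = (y.sum() / sigma**2) / prec
        post = rng.normal(mu_n, prec**-0.5, size=n_post)
        # 4) rank the true value among the posterior draws
        ranks.append(int((post < theta).sum()))
    return np.array(ranks)

ranks = sbc_ranks()
# If model and sampler agree, the ranks are ~uniform on 0..n_post;
# a skewed or U-shaped rank histogram signals miscalibration.
```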
Do you have an existing .mdl file that incorporates process noise in the same way that you're injecting it into the Stan version of the model?
As we discussed there are a few ways to create the 30 sets of random draws, the key distinction here being whether you want to 1) rely on Vensim's random number generation, or 2) use exactly the same 30 sets of values (and process noise stream as well) to get an exact replication of the random component you have in Stan.
My inclination is that you should do the former but it's up to you. The former would mean the datasets produced are not expected to be identical, just statistically comparable. The latter would be more complicated to implement.
Also, doing steps 1 and 2 is easy enough, but steps 3 and 4 would be outside Vensim / VST's current capabilities - instead you'd get 30 exported CSVs or similar files with 100 sets of parameter estimates each, which you could then use to calculate and rank log likelihoods elsewhere (Python, R, etc.). (@tomfid, is there a way to get the corresponding payoff for each accepted point in the MCMC sample in Vensim, if the likelihood is constructed manually as a variable in the model?)
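As a sketch of that post-processing step in Python: the column names and values below are hypothetical placeholders, not Vensim's actual export format, and the `log_lik` column is assumed to have been computed or exported per estimate.

```python
import io
import pandas as pd

# Hypothetical export: rows from several synthetic datasets, each with a
# parameter estimate and its log likelihood (illustrative numbers only).
csv_text = """dataset,alpha,beta,log_lik
1,0.52,1.9,-101.3
1,0.48,2.1,-99.7
2,0.55,1.8,-103.2
2,0.50,2.0,-98.4
"""
df = pd.read_csv(io.StringIO(csv_text))

# Rank estimates by log likelihood within each synthetic dataset (rank 1 = best)
df["rank"] = df.groupby("dataset")["log_lik"].rank(ascending=False)

# Pull out the best estimate per dataset
best = df.loc[df.groupby("dataset")["log_lik"].idxmax()]
```

With 30 exported files, the same pattern applies after concatenating them with a dataset identifier column.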
(Sorry, I've been working quite slowly yesterday and today; I've been dealing with a sick baby.)
One of the files generated in an MCMC run ends with _MCMC_points.tab, and it contains the payoff as well as the parameters for each point. It includes rejected points, so you'd have to filter a bit to match the sample.
> One of the files generated in an MCMC run ends with _MCMC_points.tab, and it contains the payoff as well as the parameters for each point. It includes rejected points, so you'd have to filter a bit to match the sample.
Is there an easy way to do that (e.g. a unique point identifier), or is it just a matter of matching on the parameter value sets that are in the sample file?
Now that I think about it, using the Vensim RNG for process noise is fine, but it would take a little care to get it behaving the same way as the Stanified version. The reason is that TIME STEP=.0625, but the point vector for process noise has 20 points at unit time intervals (we think), interpolated in between. In Vensim, you'd normally either SMOOTH a white noise input to filter the high frequencies, or use SAMPLE IF TRUE to update the noise in a stepwise fashion. Neither of these exactly matches the linear interpolation in the Stanified version. You could use VECTOR LOOKUP to get the same linear interpolation behavior I guess.
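To see the difference concretely, here is a sketch (with illustrative numbers, not the actual model's noise stream) comparing a stepwise noise update, like SAMPLE IF TRUE holding each draw for one time unit, against linear interpolation between unit-time noise points as in the Stanified version:

```python
import numpy as np

dt = 0.0625                      # TIME STEP
t = np.arange(0.0, 20.0 + dt, dt)
knots = np.arange(0.0, 21.0)     # noise points at unit time intervals
rng = np.random.default_rng(1)
eps = rng.normal(0.0, 1.0, size=knots.size)  # one noise draw per unit interval

# Linear interpolation between unit-time noise points (Stanified behavior)
noise_interp = np.interp(t, knots, eps)

# Stepwise update: hold each draw constant for one time unit (SAMPLE IF TRUE-like)
noise_step = eps[np.floor(t).astype(int)]
```

The two streams agree exactly at the unit-time points but differ in between, which is the mismatch being described; a SMOOTH of white noise would differ in yet another way (exponential filtering rather than either of these).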
Points are coded; status codes are as follows:

| Code | Meaning |
| --- | --- |
| -2 | Payoff error (e.g., an FP error, or a negative payoff that should represent a likelihood) |
| -1 | Rejected |
| 0 | Initial |
| 1 | Accepted |
| 2 | Repeated |
| 3 | Accepted during burn-in |
| 4 | Repeated during burn-in |
| 5 | Accepted, but above payoff sensitivity threshold |
| 6 | Repeated, but above payoff sensitivity threshold |
| 7 | Improvement on best payoff (this will duplicate a point reported as 0-6) |

So, you'd want the 1s and 2s, I guess.
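A sketch of that filtering step, assuming a hypothetical column layout for the _MCMC_points.tab file (the real file's columns and values may differ):

```python
import io
import pandas as pd

# Hypothetical excerpt of a *_MCMC_points.tab file; "status, payoff, then
# parameter columns" is an assumption, not the documented layout.
tab_text = (
    "status\tpayoff\talpha\tbeta\n"
    "0\t-120.5\t0.40\t2.2\n"
    "-1\t-130.1\t0.70\t1.5\n"
    "1\t-101.3\t0.52\t1.9\n"
    "2\t-101.3\t0.52\t1.9\n"
    "3\t-110.8\t0.45\t2.0\n"
)
points = pd.read_csv(io.StringIO(tab_text), sep="\t")

# Keep only post-burn-in accepted (1) and repeated (2) points to match the sample
sample = points[points["status"].isin([1, 2])]
```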
For stanify, this commit resolves the issue.
@tseyanglim, for hierarchical Bayesian models, would using VST have an edge over Venpy? Since you have worked with hierarchical Bayesian models, I suspect VST has better support for analysis and plots with subscripts.
@tomfid, could you elaborate on the process behind the following plots for the MCMC prey-predator model here?
I am asking because @ali-akhavan89 shared during the 879 seminar that he used Excel for the following analysis, and I wonder whether there are ways to do this directly in Python.
Q answered in other thread.
I did the same sort of ground-truth analysis for the spatial/hierarchical pred prey model:
Basically everything works - all parameters are within the expected +/- 2 SD ranges of the estimates (I haven't gotten around to plotting the actual distributions yet, so non-Gaussian behavior might be an issue).
Since the pred/prey model has a bigger parameter space and hierarchy, and is (probably) more nonlinear, I would expect it to be a harder problem. Therefore I'd interpret the result above as a convergence failure, or maybe some kind of identifiability problem.
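The ±2 SD check described above can be sketched as follows, with made-up recovery results standing in for the actual estimates:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical recovery results: true (ground-truth) values plus posterior
# means and SDs from the corresponding MCMC runs (illustrative numbers).
true_vals = rng.normal(0.0, 1.0, size=8)
post_mean = true_vals + rng.normal(0.0, 0.1, size=8)  # estimates near truth
post_sd = np.full(8, 0.3)

# Flag each parameter whose true value lies within mean +/- 2 SD
within = np.abs(true_vals - post_mean) <= 2 * post_sd
coverage = within.mean()  # fraction of parameters recovered
```

For a well-calibrated (and roughly Gaussian) posterior, you'd expect coverage near 95%; much lower coverage points toward the convergence or identifiability problems mentioned above.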
I guess another possibility in @ali-akhavan89's slide is that "Least Squares" implies the Normal calibration payoff distribution was used instead of Gaussian. That's a pure sum of squares, omitting the /2 and LN(sd) terms in the likelihood. The sd doesn't matter as long as you're not estimating it, but the missing /2 means the log likelihood is bigger than it should be. That would make the bounds too tight. You can compensate by setting the MCMC temp=2.
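A quick numeric check of why temp=2 compensates: with sd = 1, the pure sum-of-squares payoff divided by 2 equals the Gaussian log likelihood (up to the additive constant, which cancels in MCMC acceptance ratios anyway).

```python
import numpy as np

err = np.array([0.3, -1.2, 0.7])  # residuals y - yhat (illustrative)
sigma = 1.0

# Gaussian log likelihood, dropping the additive constant:
# -sum(err^2)/(2*sigma^2) - n*log(sigma)
loglik = -(err**2).sum() / (2 * sigma**2) - err.size * np.log(sigma)

# "Least Squares" payoff: pure negative sum of squares, no /2, no log(sigma)
payoff = -(err**2).sum()

# MCMC acceptance uses payoff/temp, so temp=2 restores the missing /2 (sigma=1)
assert np.isclose(payoff / 2.0, loglik)
```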