
stanify's Issues

Putting prior, prior predictive, and posterior into one InferenceData for hierarchical SBC

Two large tasks, discussed with @Dashadower.

On the draws_data_mapping side

To implement this, first apply the arviz sbc_cmdstanpy approach by @OriolAbril to stanify for the S3R1 case. In commit 3b65e77, I marked three parts for code injection. Then try the S3R2 case.

With the current implementation (before the arviz wrapper code injection), S3R1 runs, creating one generator.nc and three estimator.nc files.

However, S3R2 fails with the following error, which calls for an in-depth look at the size of each data array.

Traceback (most recent call last):
  File "/Users/hyunjimoon/Dropbox/stanify/prey_predator.py", line 35, in <module>
    draws2data2draws('vensim_models/prey_predator/prey_predator.mdl', setting, numeric, prior, S, M, N, R)
  File "/Users/hyunjimoon/Dropbox/stanify/stanify/calibrator/draws_data_mapper.py", line 168, in draws2data2draws
    obs_dict = {k: v.values.flatten().reshape((model.n_t, R)) for (k, v) in obs_s_xr[setting_obs].items()} #xr.concat([ posterior_lst])
  File "/Users/hyunjimoon/Dropbox/stanify/stanify/calibrator/draws_data_mapper.py", line 168, in <dictcomp>
    obs_dict = {k: v.values.flatten().reshape((model.n_t, R)) for (k, v) in obs_s_xr[setting_obs].items()} #xr.concat([ posterior_lst])
ValueError: cannot reshape array of size 200 into shape (200,2)
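
A minimal sketch of what the reshape runs into (my reading, not verified against the data): with n_t = 200 and R = 2, reshape((n_t, R)) needs 400 values, but the flattened observed variable only has 200, i.e. one region's worth. The region dimension is probably missing from obs_s_xr upstream, so either the generated data has to carry a region dimension or the reshape target effectively collapses to (n_t, 1).

    import numpy as np

    v = np.arange(200)        # stand-in for v.values.flatten()
    n_t, R = 200, 2
    # v.reshape((n_t, R))     # ValueError: cannot reshape array of size 200 into shape (200,2)
    v.reshape((n_t, 1))       # only succeeds because the data effectively has R = 1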

On the builder (auto Stan file generator) side

Update three parts:

  1. draws2data gq block
    from
real pred_birth_frac = normal_rng(0.05, 0.005);

to

 for (region in 1:r)
        pred_birth_frac[region] = normal_rng(0.05, 0.005);
  2. data2draws parameters block
    from
    from
real<lower=0> pred_birth_frac;

to

array[region] real<lower=0> pred_birth_frac;
  3. data2draws model block
    from
    from
pred_birth_frac ~ normal(0.05, 0.005);

to

    for (region in 1:r)
        pred_birth_frac[region] ~ normal(0.05, 0.005);

Different dimensions for pre-defined kwargs and data

With the setting S = 5, Q = 3, R = 2 and

    idata_kwargs = dict(
        prior_predictive=["prey_obs", "predator_obs"],
        posterior_predictive=["prey_obs", "predator_obs"],
        coords={
            "time": [n for n in range(N)],
            "stock": [q for q in range(Q)],
            "region": [r for r in range(R)]
        },
        dims = {
            'initial_outcome': ["stock"],
            'integrated_result': ["time", "stock"],
            'prey': ["time", "region"],
            'process_noise':  ["time",  "region"],
            'predator': ["time", "region"],
            "prey_obs": ["time", "region"],
            "predator_obs": ["time", "region"],
        }
    )

I get the following error message:

different number of dimensions on data and dims: 3 vs 4

When I print the shape, dims, and coords at the point where the error occurs (elif len(dims) != len(shape):), I see:

from xarray shape :  (1, 5, 200) dims ['chain', 'draw', 'time', 'region'] coords {'chain': <xarray.IndexVariable 'chain' (chain: 1)>

region is missing from the shape of the data, i.e. the prior predictive returned by Stan's sample(). I am debugging based on xarray's _infer_coords_and_dims function; more detailed output from this function is shown below.

_infer_coords_and_dims(data.shape, coords, dims) ({'chain': <xarray.IndexVariable 'chain' (chain: 1)>
array([0]), 'draw': <xarray.IndexVariable 'draw' (draw: 5)>
array([0, 1, 2, 3, 4])}, ['chain', 'draw'])
_infer_coords_and_dims(data.shape, coords, dims) ({'chain': <xarray.IndexVariable 'chain' (chain: 1)>
array([0]), 'draw': <xarray.IndexVariable 'draw' (draw: 5)>
array([0, 1, 2, 3, 4])}, ['chain', 'draw'])
_infer_coords_and_dims(data.shape, coords, dims) ({'chain': <xarray.IndexVariable 'chain' (chain: 1)>
array([0]), 'draw': <xarray.IndexVariable 'draw' (draw: 5)>
array([0, 1, 2, 3, 4])}, ['chain', 'draw'])
_infer_coords_and_dims(data.shape, coords, dims) ({'chain': <xarray.IndexVariable 'chain' (chain: 1)>
array([0]), 'draw': <xarray.IndexVariable 'draw' (draw: 5)>
array([0, 1, 2, 3, 4])}, ['chain', 'draw'])
_infer_coords_and_dims(data.shape, coords, dims) ({'chain': <xarray.IndexVariable 'chain' (chain: 1)>
array([0]), 'draw': <xarray.IndexVariable 'draw' (draw: 5)>
array([0, 1, 2, 3, 4])}, ['chain', 'draw'])
_infer_coords_and_dims(data.shape, coords, dims) ({'chain': <xarray.IndexVariable 'chain' (chain: 1)>
array([0]), 'draw': <xarray.IndexVariable 'draw' (draw: 5)>
array([0, 1, 2, 3, 4])}, ['chain', 'draw'])
_infer_coords_and_dims(data.shape, coords, dims) ({'chain': <xarray.IndexVariable 'chain' (chain: 1)>
array([0]), 'draw': <xarray.IndexVariable 'draw' (draw: 5)>
array([0, 1, 2, 3, 4]), 'initial_outcome_dim_0': <xarray.IndexVariable 'initial_outcome_dim_0' (initial_outcome_dim_0: 3)>
array([0, 1, 2])}, ['chain', 'draw', 'initial_outcome_dim_0'])
_infer_coords_and_dims(data.shape, coords, dims) ({'chain': <xarray.IndexVariable 'chain' (chain: 1)>
array([0]), 'draw': <xarray.IndexVariable 'draw' (draw: 5)>
array([0, 1, 2, 3, 4]), 'integrated_result_dim_0': <xarray.IndexVariable 'integrated_result_dim_0' (integrated_result_dim_0: 200)>
array([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,  13,
...
       196, 197, 198, 199]), 'integrated_result_dim_1': <xarray.IndexVariable 'integrated_result_dim_1' (integrated_result_dim_1: 3)>
array([0, 1, 2])}, ['chain', 'draw', 'integrated_result_dim_0', 'integrated_result_dim_1'])
_infer_coords_and_dims(data.shape, coords, dims) ({'chain': <xarray.IndexVariable 'chain' (chain: 1)>
array([0]), 'draw': <xarray.IndexVariable 'draw' (draw: 5)>
array([0, 1, 2, 3, 4]), 'process_noise_dim_0': <xarray.IndexVariable 'process_noise_dim_0' (process_noise_dim_0: 200)>
array([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,  13,
...
       196, 197, 198, 199])}, ['chain', 'draw', 'process_noise_dim_0'])
_infer_coords_and_dims(data.shape, coords, dims) ({'chain': <xarray.IndexVariable 'chain' (chain: 1)>
array([0]), 'draw': <xarray.IndexVariable 'draw' (draw: 5)>
array([0, 1, 2, 3, 4]), 'prey_dim_0': <xarray.IndexVariable 'prey_dim_0' (prey_dim_0: 200)>
array([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,  13,
...
       196, 197, 198, 199])}, ['chain', 'draw', 'prey_dim_0'])
_infer_coords_and_dims(data.shape, coords, dims) ({'chain': <xarray.IndexVariable 'chain' (chain: 1)>
array([0]), 'draw': <xarray.IndexVariable 'draw' (draw: 5)>
array([0, 1, 2, 3, 4]), 'predator_dim_0': <xarray.IndexVariable 'predator_dim_0' (predator_dim_0: 200)>
array([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,  13,
...
       196, 197, 198, 199])}, ['chain', 'draw', 'predator_dim_0'])
_infer_coords_and_dims(data.shape, coords, dims) ({'chain': <xarray.IndexVariable 'chain' (chain: 1)>
array([0]), 'draw': <xarray.IndexVariable 'draw' (draw: 5)>
array([0, 1, 2, 3, 4]), 'prey_obs_dim_0': <xarray.IndexVariable 'prey_obs_dim_0' (prey_obs_dim_0: 200)>
array([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,  13,
...
       196, 197, 198, 199])}, ['chain', 'draw', 'prey_obs_dim_0'])
_infer_coords_and_dims(data.shape, coords, dims) ({'chain': <xarray.IndexVariable 'chain' (chain: 1)>
array([0]), 'draw': <xarray.IndexVariable 'draw' (draw: 5)>
array([0, 1, 2, 3, 4]), 'predator_obs_dim_0': <xarray.IndexVariable 'predator_obs_dim_0' (predator_obs_dim_0: 200)>
array([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,  13,
...
       196, 197, 198, 199])}, ['chain', 'draw', 'predator_obs_dim_0'])
_infer_coords_and_dims(data.shape, coords, dims) ({'chain': <xarray.IndexVariable 'chain' (chain: 1)>
array([0]), 'draw': <xarray.IndexVariable 'draw' (draw: 5)>
array([0, 1, 2, 3, 4])}, ['chain', 'draw'])
_infer_coords_and_dims(data.shape, coords, dims) ({'chain': <xarray.IndexVariable 'chain' (chain: 1)>
array([0]), 'draw': <xarray.IndexVariable 'draw' (draw: 5)>
array([0, 1, 2, 3, 4])}, ['chain', 'draw'])

It looks like chain and draw should be reset_index-ed in advance.
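
Separately, a quick way to see which generated quantities actually carry the region dimension before declaring dims (a debugging sketch; fit stands in for a hypothetical CmdStanMCMC result from draws2data):

    for name, arr in fit.stan_variables().items():
        print(name, arr.shape)   # e.g. prey_obs coming back as (draws, 200) instead of (draws, 200, 2)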

Divergence using Beta(2,2) for the measurement noise scale's prior

Using a Beta(2,2) prior for the measurement noise scale, I faced the following divergences. Perhaps the Beta upper bound of 1 is too small, i.e. the measurement noise scale implied by the synthetic data can exceed 1.

	Exception: beta_lpdf: Random variable is 1, but must be in the interval [0, 1] (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/HW7_estag_mnoisebeta22_pnoisedot01/HW7_estag_mnoisebeta22_pnoisedot01_data2draws.stan', line 45, column 4 to column 31)
	Exception: beta_lpdf: Random variable is 1, but must be in the interval [0, 1] (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/HW7_estag_mnoisebeta22_pnoisedot01/HW7_estag_mnoisebeta22_pnoisedot01_data2draws.stan', line 45, column 4 to column 31)
Consider re-running with show_console=True if the above output is unclear!
17:24:11 - cmdstanpy - WARNING - Some chains may have failed to converge.
	Chain 1 had 89 divergent transitions (35.6%)
	Chain 2 had 71 divergent transitions (28.4%)
	Chain 3 had 86 divergent transitions (34.4%)
	Chain 4 had 65 divergent transitions (26.0%)
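
A quick check of that hypothesis (a sketch only, assuming an additive measurement model, which may not match the generated Stan code): estimate the noise scale implied by the synthetic data and see whether it sits near or above the Beta prior's upper bound of 1.

    import numpy as np

    # y_obs and y are hypothetical names for an observed series and its simulated
    # counterpart taken from the generator (draws2data) output.
    resid = y_obs - y
    print(np.std(resid))   # values around or above 1 would make a Beta(2,2) prior too restrictive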

mkdir stanfile edge cases

The Stan file is not generated when:

  1. the .mdl file sits in a subfolder, e.g. "/inventory/compare_vensim_stan.mdl"
works:
vensim_model_path = "/Users/hyunjimoon/Dropbox/tolzul/BayesSD/code/pipeline/prescriptive/stanify/vensim_models/compare_vensim_stan.mdl"

does not work:
vensim_model_path = "/Users/hyunjimoon/Dropbox/tolzul/BayesSD/code/pipeline/prescriptive/stanify/vensim_models/inventory/compare_vensim_stan.mdl"
  2. the stan_files folder does not already exist under the vignette folder (it must be created before running the notebook)


sbc rank histogram bug

Under the setting


precision ={
    "S": 1, # # of draws from prior
    "M": 500, # # of draws from posterior (# of chains * # of draws from each chain)
    "N": 30, # # of observation
    "R": 8 # # of subgroups for hierarchical Bayes
}

setting = {
    "est_param_names" : ["adj_frac1_loc","adj_frac2_loc", "adj_frac3_loc", "adj_frac1", "adj_frac2","adj_frac3"],# "ss2p_frac4"],# "ss2p_frac4"
    "hier_est_param_names": ["adj_frac1", "adj_frac2", "adj_frac3"],# "ss2p_frac4"],
    "target_simulated_vector_names" : ["ss", "s"],
    "driving_vector_names" : ["exog_demand", "process_noise_normal_driving"],
    "model_name": "2hier_s_lgnorm_.01"
}
init_order = 100

the following rank figure is observed; the counts sum to 2 + 2 + 1 + 3 = 8 = R, which suggests each hierarchy level is treated separately when calculating rank.
[SBC rank histogram figure]

Decoupling input (S, M, N, P, Q) and output (stan_files, data, plot)

There is no need to re-translate the Vensim .mdl to .stan and re-compile to .hpp (plus the Unix executable) when M (the number of posterior draws) is the only thing changing. However, M does change the data and plot folders. A workflow that respects the following dependencies is needed:

S, N, P, Q -> M
stan_files -> data -> plot

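A rough sketch of one way to realize this decoupling (hypothetical helper functions, not stanify API): key the generated stan_files on the inputs that affect code generation, so that a change in M alone skips translation and compilation and only reruns the data and plot stages.

    import hashlib
    import os

    def stan_cache_key(mdl_path, S, N, P, Q):
        # Hash the Vensim model together with the settings that affect the generated Stan code.
        with open(mdl_path, "rb") as f:
            mdl_hash = hashlib.sha256(f.read()).hexdigest()[:12]
        return f"{mdl_hash}_S{S}_N{N}_P{P}_Q{Q}"

    def needs_stan_rebuild(stan_dir, key):
        # Rebuild only if no marker for this key exists; M is deliberately excluded from the key.
        return not os.path.exists(os.path.join(stan_dir, f".built_{key}"))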

Sampling restart: failed to integrate to next output time (0.01) in less than max_num_steps steps

09:58:02 - cmdstanpy - INFO - CmdStan start processing
chain 1 |          | 00:00 Status
chain 1 |▊         | 00:46 Status
chain 2 |▊         | 01:16 Status
chain 1 |█▋        | 02:39 Iteration:    1 / 1050 [  0%]  (Warmup)
chain 2 |██▌       | 02:58 Iteration:  100 / 1050 [  9%]  (Warmup)
chain 1 |██▌       | 03:35 Iteration:  100 / 1050 [  9%]  (Warmup)
chain 1 |███▎      | 04:17 Iteration:  200 / 1050 [ 19%]  (Warmup)
chain 2 |█████     | 04:24 Iteration:  400 / 1050 [ 38%]  (Warmup)
chain 1 |████▏     | 04:54 Iteration:  300 / 1050 [ 28%]  (Warmup)
chain 1 |█████     | 05:27 Iteration:  400 / 1050 [ 38%]  (Warmup)
chain 2 |███████▌  | 05:33 Iteration:  700 / 1050 [ 66%]  (Warmup)
chain 1 |█████▊    | 05:58 Iteration:  500 / 1050 [ 47%]  (Warmup)
chain 2 |█████████▏| 06:20 Iteration:  900 / 1050 [ 85%]  (Warmup)
chain 1 |██████▋   | 06:27 Iteration:  600 / 1050 [ 57%]  (Warmup)
chain 1 |██████████| 07:56 Sampling completed                       
                                                                                                                                                                
chain 2 |██████████| 07:56 Sampling completed                       
10:05:59 - cmdstanpy - INFO - CmdStan done processing.
10:05:59 - cmdstanpy - WARNING - Non-fatal error during sampling:
Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 53, column 8 to column 59)
	Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 53, column 8 to column 59)
	Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 53, column 8 to column 59)
	Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 53, column 8 to column 59)
	Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 53, column 8 to column 59)
	Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 53, column 8 to column 59)
	Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 53, column 8 to column 59)
	Exception: ode_rk45:  Failed to integrate to next output time (0.01) in less than max_num_steps steps (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 41, column 8 to column 184)
	Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 53, column 8 to column 59)
	Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 53, column 8 to column 59)
	Exception: ode_rk45:  Failed to integrate to next output time (0.01) in less than max_num_steps steps (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 41, column 8 to column 184)
Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 53, column 8 to column 59)
	Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 53, column 8 to column 59)
	Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 53, column 8 to column 59)
	Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 53, column 8 to column 59)
	Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 53, column 8 to column 59)
	Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 53, column 8 to column 59)
	Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 53, column 8 to column 59)
	Exception: ode_rk45:  Failed to integrate to next output time (0.01) in less than max_num_steps steps (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 41, column 8 to column 184)
	Exception: ode_rk45:  Failed to integrate to next output time (0.01) in less than max_num_steps steps (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 41, column 8 to column 184)
	Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 53, column 8 to column 59)
	Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 53, column 8 to column 59)
	Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 53, column 8 to column 59)
	Exception: ode_rk45:  Failed to integrate to next output time (0.01) in less than max_num_steps steps (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/pp_hierORsbc/pp_hierORsbc_data2draws.stan', line 41, column 8 to column 184)
Consider re-running with show_console=True if the above output is unclear!
est_param pred_birth_frac
10:06:01 - cmdstanpy - INFO - CmdStan start processing
chain 1 |          | 00:00 Status
chain 2 |          | 00:00 Status

prior_draw MultiIndex cannot be serialized when storing InferenceData as a netCDF file

When I attempted to store the idata as a .nc file,

    idata = az.from_cmdstanpy(
        prior=draws2data_data, **idata_kwargs
    ).stack(prior_draw=["chain", "draw"], groups="prior_groups")

I get a MultiIndex non-serializable error:

NotImplementedError: variable 'prior_draw' is a MultiIndex, which cannot yet be serialized to netCDF files (https://github.com/pydata/xarray/issues/1077). Use reset_index() to convert MultiIndex levels into coordinate variables instead.
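
A minimal sketch of the fix the error message suggests (assuming idata is the stacked InferenceData above, and that InferenceData.map accepts the "prior_groups" metagroup, as I recall): convert the prior_draw MultiIndex back into plain chain/draw coordinate variables before serializing.

    idata_flat = idata.map(lambda ds: ds.reset_index("prior_draw"), groups="prior_groups")
    idata_flat.to_netcdf("generator.nc")  # illustrative file name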

Design comparison metric

From the abstracted mechanism of draws2data2draws, design the following comparison metric using Martin's test quantities.

	def diagnose(prior_sample, posterior_sample, target_simulated, target_simulated_obs, target_obs, test_quantity):
	    return compare(test_quantity(prior_sample, target_simulated, target_simulated_obs, target_obs),
	                   test_quantity(posterior_sample, target_simulated, target_simulated_obs, target_obs))

Recycling model creates dataFunc twice in function file

I brought this (bug?) up with @Dashadower a few weeks ago, and reusing the model object seems to be the cause. Is there a way to NOT reinitialize, but only inform the model of the change?

'/Users/hyunjimoon/Dropbox/stanify/stan_files/prey_predator_S1N200Q2R1_S1N200Q2R3_S1N2000Q2R3/draws2data.stan', line 2, column 2:
   -------------------------------------------------
  1202:  }
  1203:  
  1204:  real dataFunc__process_noise_uniform_driving(real time, real time_step){
         ^
  1205:      // DataStructure for variable process_noise_uniform_driving
  1206:      real slope;
   -------------------------------------------------

Function 'dataFunc__process_noise_uniform_driving' has already been declared 

Testing new stanify - replicated? datastructure?

@Dashadower, compare_vensim_stan.mdl.zip below is a Vensim file with the data structure you requested: the light blue variable exog demand is a data structure, and the dark blue table for order fulfillment ratio is the table function. Could you please check the two points below?

  1. Setting a value for exog demand makes this variable no longer a data structure. If my memory is correct, you were trying to receive data only from Vensim; however, I don't know how this can be implemented other than with the original method (supplying the value in Python after declaring the data structure in Vensim). Happy to be corrected!

  2. Are the time series patterns replicated in the result from Stan's draws2data? For easy comparison, I've set exog demand to 2 * Time + 100 (this requires changing the exog demand type, so it is no longer a data structure), as shown below.

[Vensim screenshot: exog demand set to 2 * Time + 100]

With the value above, please make sure the overshoot of adj ss and the smooth adjustment of adj s are visible in the Stan-generated time series.
[Vensim time series: adj ss overshoot and adj s smooth adjustment]

compare_vensim_stan.mdl.zip

kwargs update to include observed data causes a problem in dict_to_dataset

File ~/Dropbox/stanify/venv/lib/python3.10/site-packages/arviz/data/base.py:316, in dict_to_dataset(data, attrs, library, coords, dims, default_dims, index_origin, skip_event_dims)
    303 if dims is None:
    304     dims = {}
    306 data_vars = {
    307     key: numpy_to_data_array(
    308         values,
    309         var_name=key,
    310         coords=coords,
    311         dims=dims.get(key),
    312         default_dims=default_dims,
    313         index_origin=index_origin,
    314         skip_event_dims=skip_event_dims,
    315     )
--> 316     for key, values in data.items()
    317 }
    318 return xr.Dataset(data_vars=data_vars, attrs=make_attrs(attrs=attrs, library=library))

AttributeError: 'list' object has no attribute 'items'

I will delete obs for now, but we need it for plot_ppc().
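
My reading of the error (an assumption, not confirmed): prior_predictive and posterior_predictive may be given as lists of variable names, but observed_data has to be a mapping from variable name to the actual array, since arviz calls .items() on it. Something along these lines should avoid the AttributeError (fit and the *_values arrays are hypothetical):

    import arviz as az

    idata_kwargs = dict(
        prior_predictive=["prey_obs", "predator_obs"],
        posterior_predictive=["prey_obs", "predator_obs"],
        observed_data={"prey_obs": prey_obs_values, "predator_obs": predator_obs_values},  # dict, not list
    )
    idata = az.from_cmdstanpy(posterior=fit, **idata_kwargs)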

Speed up with shared metric information

After the warmup period, Stan returns the metric it will use during the sampling period. Ideas like metric sharing could shorten estimation time, though the last experiment showed little success.

{"inv_metric": [0.00890483, 0.0146411, 0.0180988, 0.0173486, 0.0162696, 0.0185366, 0.0186761, 0.0176091, 0.0177719, 0.0194737, 0.0269911, 0.0241012, 0.0270683, 0.0305894, 0.0295884, 0.0285179, 0.0270491, 0.0247198, 0.00094971]}

Wrong parenthesization in builder

Vensim:

((2-(SAVEPER * corr frac))/(SAVEPER * corr frac))^0.5 * process noise normal driving * process noise scale

is translated as:

((2 - saveper * corr_frac) / saveper * corr_frac) ^ 0.5 * dataFunc__process_noise_normal_driving(time, time_saveper) * process_noise_scale;

which is wrong.
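
In the translated expression the divisor has lost its parentheses: / saveper * corr_frac is evaluated left to right as (... / saveper) * corr_frac rather than dividing by (saveper * corr_frac). A faithful translation (my reading of the Vensim expression above, not checked against the builder) would be:

((2 - (saveper * corr_frac)) / (saveper * corr_frac)) ^ 0.5 * dataFunc__process_noise_normal_driving(time, time_saveper) * process_noise_scale;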

Documentation and refactoring for initialization

I don't know an easy way to combine this repository with the other one in BayesSD, so may I ask for advice on the best way to merge the stanify repo with the following commit: Data4DM/BayesSD@b128c3e? @Dashadower

Documentation on what the different structures mean and what the different parameter names refer to should be clearly written.

  • callstructure: auxiliary
  • reference structure:
  • ignored variables: variables defined as data, and all stocks
  • exposed parameters: the ODE function's inputs
  • stan_variable is a larger set than vensim parameter

Permission denied with two .nc files

Traceback (most recent call last):
  File "/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/demo_draws2data2draws.py", line 31, in <module>
    draws2data2draws('vensim_models/mng_pop/mng_pop.mdl', setting, numeric, prior, S, M, N)
  File "/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stanify/calibrator/draws_data_mapper.py", line 83, in draws2data2draws
    posterior_s = data2draws(model, numeric, chains=4, iter_sampling=int(M/4))
  File "/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stanify/calibrator/draws_data_mapper.py", line 37, in data2draws
    post_ppc.draws_xr().to_netcdf(f"{data_path}/estimator.nc")
  File "/opt/homebrew/lib/python3.10/site-packages/xarray/core/dataset.py", line 1882, in to_netcdf
    return to_netcdf(  # type: ignore  # mypy cannot resolve the overloads:(
  File "/opt/homebrew/lib/python3.10/site-packages/xarray/backends/api.py", line 1193, in to_netcdf
    store = store_open(target, mode, format, group, **kwargs)
  File "/opt/homebrew/lib/python3.10/site-packages/xarray/backends/netCDF4_.py", line 384, in open
    return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)
  File "/opt/homebrew/lib/python3.10/site-packages/xarray/backends/netCDF4_.py", line 332, in __init__
    self.format = self.ds.data_model
  File "/opt/homebrew/lib/python3.10/site-packages/xarray/backends/netCDF4_.py", line 393, in ds
    return self._acquire()
  File "/opt/homebrew/lib/python3.10/site-packages/xarray/backends/netCDF4_.py", line 387, in _acquire
    with self._manager.acquire_context(needs_lock) as root:
  File "/opt/homebrew/Cellar/[email protected]/3.10.6_2/Frameworks/Python.framework/Versions/3.10/lib/python3.10/contextlib.py", line 135, in __enter__
    return next(self.gen)
  File "/opt/homebrew/lib/python3.10/site-packages/xarray/backends/file_manager.py", line 189, in acquire_context
    file, cached = self._acquire_with_cache_info(needs_lock)
  File "/opt/homebrew/lib/python3.10/site-packages/xarray/backends/file_manager.py", line 207, in _acquire_with_cache_info
    file = self._opener(*self._args, **kwargs)
  File "src/netCDF4/_netCDF4.pyx", line 2463, in netCDF4._netCDF4.Dataset.__init__
  File "src/netCDF4/_netCDF4.pyx", line 2026, in netCDF4._netCDF4._ensure_nc_success
PermissionError: [Errno 13] Permission denied: b'/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/data/draws2data2draws_mngpop_multiS2/estimator.nc'

Documenting context type classification

SBC inputs are classified into three categories, each typified as a Python class. PrecisionContext and StanModelContext are dataclasses, and only VensimModelContext receives an abstract_model (the ODE structure) during its initialization.

  • precision (class: PrecisionContext)
  • numeric (class: StanModelContext)
  • structure (class: VensimModelContext)

For the prey-predator model's data2draws (different from draws2data's, e.g. in the _obs family), stan_model_context consists of:

StanModelContext(

	initial_time=0.0, 

	integration_times=array([0.01,.., 0.58]), 

	stan_data = {
		'process_noise_uniform_driving': StanDataEntry(data_name='process_noise_uniform_driving', stan_type='vector[20]'), 
                'process_noise_scale': StanDataEntry(data_name='process_noise_scale', stan_type='real'), 
		'prey_obs': StanDataEntry(data_name='prey_obs', stan_type='vector[20]'), 
		'predator_obs': StanDataEntry(data_name='predator_obs', stan_type='vector[20]')}, 

	sample_statements
		=[<stanify.builders.stan_model.SamplingStatement object at 0x14b94ae30>, 
		  <stanify.builders.stan_model.SamplingStatement object at 0x14b94a980>, 
		  <stanify.builders.stan_model.SamplingStatement object at 0x14b94ae60>, 
		  <stanify.builders.stan_model.SamplingStatement object at 0x14b94a740>, 
		  <stanify.builders.stan_model.SamplingStatement object at 0x14b94aef0>], 

	exposed_parameters={'pred_birth_frac', 'prey_birth_frac',  'process_noise_scale', 'time_step'}, 
	all_stan_variables={'prey_obs', 'predator_obs', 'prey_birth_frac', 'pred_birth_frac',  'm_noise_scale'}
)

SamplingStatement holds the estimated parameter's distribution information.

Below is the VensimModelContext structure.

class VensimModelContext:
    def __init__(self, abstract_model):
        self.variable_names = set()  # stanified
        self.stock_variable_names = set()
        self.abstract_model = abstract_model

        # Some basic checks to make sure the AM is compatible
        assert len(abstract_model.sections) == 1, "Number of sections in AbstractModel must be 1."

        for element in abstract_model.sections[0].elements:
            assert len(element.components) == 1, f"Number of components in AbstractElement must be 1, but {element.name} has {len(element.components)}"
            self.variable_names.add(vensim_name_to_identifier(element.name))

        for element in abstract_model.sections[0].elements:
            for component in element.components:
                if isinstance(component.ast, IntegStructure):
                    self.stock_variable_names.add(vensim_name_to_identifier(element.name))
                    break

Test new interface

List of tests:

  • data/lookup structure with subscripts
  • different combinations of driving data, estimated parameters, and assumed parameters
  • parameters' lower and upper bounds

Tested, did not pass:

  1. How to estimate adj_frac1? The following didn't work:
v2s_code = """
ss_obs[timesteps] ~ normal(ss[timesteps], sigma);
adj_frac1<lower=0> ~ normal(0.2, 0.02);
sigma<lower=0> ~ normal(0.1, 0.01);
"""
  2. How to use two data variables? Passing ["ss_obs", "s_obs"] to Vensim2Stan() doesn't work well.

  3. What to do when the data structure has exo_demand[reg]?

Functions declared twice with two draws2data2draws in one model

Semantic error in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/Inven_6est/Inven_6est_functions.stan', line 65, column 0, included from
'/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/Inven_6est/Inven_6est_draws2data.stan', line 2, column 0:
   -------------------------------------------------
    63:  }
    64:  
    65:  real lookupFunc__table_for_order_fulfillment(real x){
         ^
    66:      // x (0, 2) = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.0)
    67:      // y (0, 1) = (0.0, 0.2, 0.4, 0.58, 0.73, 0.85, 0.93, 0.97, 0.99, 1.0, 1.0, 1.0)
   -------------------------------------------------

Function 'lookupFunc__table_for_order_fulfillment' has already been declared to for signature 
(real) => real
make: *** [/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/Inven_6est/Inven_6est_draws2data.hpp] Error 1

Command ['make', 'STANCFLAGS+=--include-paths=/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/Inven_6est', '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/Inven_6est/Inven_6est_draws2data']
	error during processing No such file or directory

When I checked the Stan functions file, functions were indeed declared twice:

real lookupFunc__table_for_order_fulfillment(real x){
    // x (0, 2) = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.0)
    // y (0, 1) = (0.0, 0.2, 0.4, 0.58, 0.73, 0.85, 0.93, 0.97, 0.99, 1.0, 1.0, 1.0)
    real slope;
    real intercept;

    if(x <= 0.2){
        intercept = 0.0;
        slope = (0.2 - 0.0) / (0.2 - 0.0);
        return intercept + slope * (x - 0.0);
    }
   ...
    else if(x <= 2.0){
        intercept = 1.0;
        slope = (1.0 - 1.0) / (2.0 - 2.0);
        return intercept + slope * (x - 2.0);
    }
    return 1.0;
}

real lookupFunc__table_for_order_fulfillment(real x){
    // x (0, 2) = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.0)
    // y (0, 1) = (0.0, 0.2, 0.4, 0.58, 0.73, 0.85, 0.93, 0.97, 0.99, 1.0, 1.0, 1.0)
    real slope;
    real intercept;

    if(x <= 0.2){
        intercept = 0.0;
        slope = (0.2 - 0.0) / (0.2 - 0.0);
        return intercept + slope * (x - 0.0);
    }
   ...
    else if(x <= 2.0){
        intercept = 1.0;
        slope = (1.0 - 1.0) / (2.0 - 2.0);
        return intercept + slope * (x - 2.0);
    }
    return 1.0;
}

// Begin ODE declaration
real dataFunc__customer_order_rate(real time, real time_step){
    // DataStructure for variable customer_order_rate
    real slope;
    real intercept;

    if(time <= time_step * 1){
        intercept = 146376;
        real local_time_step = time_step * 1 - time_step * 0;
        slope = (147079 - 146376) / local_time_step;
        return intercept + slope * (time - time_step * 0);
    }
   ...
    else if(time <= time_step * 19){
        intercept = 183682;
        real local_time_step = time_step * 19 - time_step * 18;
        slope = (183318 - 183682) / local_time_step;
        return intercept + slope * (time - time_step * 18);
    }
    return 183318;
}

real dataFunc__process_noise_uniform_driving(real time, real time_step){
    // DataStructure for variable process_noise_uniform_driving
    real slope;
    real intercept;

    if(time <= time_step * 1){
        intercept = 0.1236860198383215;
        real local_time_step = time_step * 1 - time_step * 0;
        slope = (-0.35346438681712344 - 0.1236860198383215) / local_time_step;
        return intercept + slope * (time - time_step * 0);
...
    else if(time <= time_step * 19){
        intercept = 0.38848170528082115;
        real local_time_step = time_step * 19 - time_step * 18;
        slope = (-0.4071190492640847 - 0.38848170528082115) / local_time_step;
        return intercept + slope * (time - time_step * 18);
    }
    return -0.4071190492640847;
}

real dataFunc__customer_order_rate(real time, real time_step){
    // DataStructure for variable customer_order_rate
    real slope;
    real intercept;

    if(time <= time_step * 1){
        intercept = 146376;
        real local_time_step = time_step * 1 - time_step * 0;
        slope = (147079 - 146376) / local_time_step;
        return intercept + slope * (time - time_step * 0);
    }
    ...
    else if(time <= time_step * 19){
        intercept = 0.38848170528082115;
        real local_time_step = time_step * 19 - time_step * 18;
        slope = (-0.4071190492640847 - 0.38848170528082115) / local_time_step;
        return intercept + slope * (time - time_step * 18);
    }
    return -0.4071190492640847;
}

vector vensim_ode_func(real time, vector outcome, real time_step, real wip_adjustment_time, real safety_stock_coverage, real target_delivery_delay, real minimum_order_processing_time, real process_noise_scale, real manufacturing_cycle_time, real inventory_adjustment_time){
    vector[7] dydt;  // Return vector of the ODE function

    // State variables
    real expected_order_rate = outcome[1];
    real process_noise = outcome[2];
    real work_in_process_inventory = outcome[3];
    real backlog = outcome[4];
    real production_rate_stocked = outcome[5];
    real inventory = outcome[6];
    real production_start_rate_stocked = outcome[7];

    real desired_inventory_coverage = minimum_order_processing_time + safety_stock_coverage;
    real desired_inventory = desired_inventory_coverage * expected_order_rate;
    real adjustment_from_inventory = (desired_inventory - inventory) / inventory_adjustment_time;
    real desired_production = fmax(0, expected_order_rate + adjustment_from_inventory);
    real desired_wip = manufacturing_cycle_time * desired_production;
    real adjustment_for_wip = (desired_wip - work_in_process_inventory) / wip_adjustment_time;
    real desired_production_start_rate = fmax(0, desired_production + adjustment_for_wip);
    real production_start_rate = fmax(0, desired_production_start_rate);
    real desired_minus_shadow_psr = production_start_rate - production_start_rate_stocked;
    real production_start_rate_stocked_change_rate = desired_minus_shadow_psr / time_step;
    real desired_shipment_rate = backlog / target_delivery_delay;
    real production_rate = work_in_process_inventory / manufacturing_cycle_time * fmax(0, 1 + process_noise);
    real desired_minus_shadow_pr = production_rate - production_rate_stocked;
    real production_rate_stocked_change_rate = desired_minus_shadow_pr / time_step;
    real production_rate_stocked_dydt = production_rate + production_rate_stocked_change_rate;
    real time_to_average_order_rate = 8;
    real change_in_exp_orders = (dataFunc__customer_order_rate(time, time_step) - expected_order_rate) / time_to_average_order_rate;
    real maximum_shipment_rate = inventory / minimum_order_processing_time;
    real order_fulfillment_ratio = lookupFunc__table_for_order_fulfillment(maximum_shipment_rate / desired_shipment_rate);
    real correlation_time_over_time_step = 100;
    real white_noise = 4.89 * correlation_time_over_time_step ^ 0.5 * dataFunc__process_noise_uniform_driving(time, time_step) * process_noise_scale;
    real shipment_rate = desired_shipment_rate * order_fulfillment_ratio;
    real order_fulfillment_rate = shipment_rate;
    real order_rate = dataFunc__customer_order_rate(time, time_step);
    real backlog_dydt = order_rate - order_fulfillment_rate;
    real white_minus_process = white_noise - process_noise;
    real correlation_time = time_step * correlation_time_over_time_step;
    real process_noise_change_rate = white_minus_process / correlation_time;
    real expected_order_rate_dydt = change_in_exp_orders;
    real production_start_rate_stocked_dydt = production_start_rate + production_start_rate_stocked_change_rate;
    real process_noise_dydt = process_noise_change_rate;
    real work_in_process_inventory_dydt = production_start_rate - production_rate;
    real inventory_dydt = production_rate - shipment_rate;

    dydt[1] = expected_order_rate_dydt;
    dydt[2] = process_noise_dydt;
    dydt[3] = work_in_process_inventory_dydt;
    dydt[4] = backlog_dydt;
    dydt[5] = production_rate_stocked_dydt;
    dydt[6] = inventory_dydt;
    dydt[7] = production_start_rate_stocked_dydt;

    return dydt;
}

Exposed parameter update error

I keep facing an error where driving data (for which we should supply numeric values via the lookup structure) is written twice. I wonder whether this line of code should be changed to

self.stan_model_context.exposed_parameters.update(name)

instead of

self.stan_model_context.exposed_parameters.update(used_variable_names)

Could you please check? @Dashadower
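
A side note on Python set semantics (not stanify-specific): calling set.update(name) with a string adds its individual characters rather than the name itself, so

    s = set()
    s.update("demand")
    print(s)   # prints the characters {'d', 'e', 'm', 'a', 'n'}, not {'demand'}

If the intent is to add a single variable name, set.add(name) or update([name]) would be needed.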

set_prior is sensitive to the input order

@Dashadower, regarding adding one layer to the hierarchical model by passing a "_loc" parameter as a string: the current implementation is sensitive to the input order. Do you have any good idea for fixing this?

When the input order is

    ("adj_frac1_loc", "normal", .25, .25* .1, 0), # order
    ("adj_frac1", "normal", "adj_frac1_loc", .25 *.1, 0),
    ("adj_frac2", "normal", .125,.125 *.1, 0),

it works well, but when adj_frac1_loc is used before its own prior is defined (its definition only appears in the third line below),

    ("adj_frac1", "normal", "adj_frac1_loc", .25 *.1, 0),
    ("adj_frac2", "normal", .125,.125 *.1, 0),
    ("adj_frac1_loc", "normal", .25, .25* .1, 0), 

returns the following error:

/Users/hyunjimoon/Dropbox/stanify/venv/bin/python /Users/hyunjimoon/Dropbox/stanify/vignette/test.py 
Traceback (most recent call last):
  File "/Users/hyunjimoon/Dropbox/stanify/vignette/test.py", line 71, in <module>
    idata_orig, model = draws2data2draws('../vensim_models/inventory/hier2_stock.mdl', setting, precision, numeric, prior, output_format)
  File "/Users/hyunjimoon/Dropbox/stanify/stanify/calibrator/draws_data_mapper.py", line 72, in draws2data2draws
    model = transform_input(vensim, setting, precision, numeric, prior, idata_kwargs)
  File "/Users/hyunjimoon/Dropbox/stanify/stanify/calibrator/draws_data_mapper.py", line 206, in transform_input
    model.set_prior(est_param[0], est_param[1], est_param[2], est_param[3], lower = est_param[4])
  File "/Users/hyunjimoon/Dropbox/stanify/stanify/builders/stan_model.py", line 180, in set_prior
    raise Exception(f"{sample_string} : '{name}' doesn't exist in the Vensim model or the Stan model!")
Exception: adj_frac1 ~ normal(adj_frac1_loc, 0.025) : 'adj_frac1_loc' doesn't exist in the Vensim model or the Stan model!
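
A possible user-side workaround until set_prior handles this (my sketch, not the project's fix): reorder the prior tuples so that any hyperparameter referenced by name, such as "adj_frac1_loc", is registered before the parameters that depend on it.

    def order_priors(prior_tuples):
        # Tuples look like (name, dist, loc, scale, lower); loc may be a number or
        # the name of another estimated parameter (a "_loc" hyperparameter).
        names = {t[0] for t in prior_tuples}
        independent = [t for t in prior_tuples if not (isinstance(t[2], str) and t[2] in names)]
        dependent = [t for t in prior_tuples if isinstance(t[2], str) and t[2] in names]
        return independent + dependent  # one level of hierarchy suffices for the case above

    prior = order_priors([
        ("adj_frac1", "normal", "adj_frac1_loc", .25 * .1, 0),
        ("adj_frac2", "normal", .125, .125 * .1, 0),
        ("adj_frac1_loc", "normal", .25, .25 * .1, 0),
    ])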

Plot design for hierarchical SBC

In the posterior_check function, construct a structure that feeds prior_sample into the generated quantities block's loglik. Graphical checks will let users decide the degree of hierarchy (e.g. how many levels? should the sd of the measurement model be separate for each group?).


MultiIndex after stacking chain, draw for the sbc.prior InferenceData

  1. After upgrading to the newest arviz and adding create_index=False during stack, (chain, draw) still remains in the produced dataset (sbc.prior).
  2. Why don't we have region as a coordinate?

[screenshot: sbc.prior dataset with (chain, draw) still present]

Specifically, I am having difficulty understanding what .mean('prior_draw') does in this setting. When I add this call, things look much better (second figure below).
[screenshot: result after applying .mean('prior_draw')]

Increase S from one to many using inferencedata coordinates

The following, from the draws2data2draws function, would give obs_xr a prior_draw coordinate on top of draw and chain. This would allow easier plotting with the arviz library.

    obs_xr = draws2data(model, numeric, iter_sampling=S)

    prior_pred_check(setting)
    numeric["process_noise_scale"] = 0.0
    obs_xr = obs_xr.assign_coords({'prior_draw': [s for s in range(S)]})  # assign_coords returns a new object

    for s in range(S):
        obs_xr_s = obs_xr[setting_obs].sel(draw=s)
        obs_dict_s = {k: v.values.flatten() for (k, v) in obs_xr_s.items()}
        for key, value in obs_dict_s.items():
            numeric[key] = value

        for target_name in setting['target_simulated_vector_names']:
            model.update_numeric({f'{target_name}_obs': obs_dict_s[f'{target_name}_obs']})
        model.update_numeric({'process_noise_scale': 0.0})

        posterior_s = data2draws(model, numeric, chains=4, iter_sampling=int(M/4))
        obs_xr.sel(prior_draw=s).update(posterior_s)  # was obs_xr['prior_draw' == s], which indexes with a boolean

Time-indexed plot

@Dashadower, related to what we discussed in our meeting: the current plot_ppc output is shown below; it needs to be updated so that time is on the x axis.
[current plot_ppc output]
