
dataindm's Introduction

Demand and Supply of this repository

The short-term purpose of this repository is to supply students of the 879 seminar series with resources, starting with 15.879 Bringing Data Into Dynamic Models (syllabus). The long-term goal is to kick off the SilkRoad project (more below). Each is being updated in the 879_Week and 879Re_Month folders.

Demand of SilkRoad project

Short answer: I wished to connect the lifelong brainchildren of the scholars I respect.

The five ideas documented below do not evaporate, thanks to their software (Vensim, SDA, Stan, BATS, SOPS). More fundamentally, each is sustained by its creator's affection. Like parents, Tom, Rogelio, Andrew, Yaman, and Erling revisit their children every so often to document and update them. Their thrill and willingness to invest time are among the few memorable emotions that inspired me, which explains my devotion to this project. I am seeking paths to ensure the sustainability of these five ideas, and I reasoned that cultural diffusion was the engine of the Silk Road for over 1,500 years. That unity gave birth to technologies such as glass and paper making, to religious ideas, and to language: some 35,000 new words entered Chinese through Buddhist missionaries.

Moreover, I can see how these ideas, once united, can contribute to humanity, and I find it my duty to preserve the heritage of my major: simulation.

Supply of SilkRoad project

Supply, based on my research interests, skill set, and network in both the statistical and system dynamics modeling communities, can meet the above demand. So why hesitate to make the first move?

  • Vensim, SBC, Stan, SDA, BATS, SOPS: six software packages, each with similar functions but in different languages, are a great obstacle in the Bayesian workflow, where iteration is key. With the help of Tom, (me), Andrew, Rogelio, Yaman, and Erling, who are the leaders (and lovers) of each software, we invite them onto the SilkRoad for a better flow.
| Step (Role, Error) | Output | Software | Symbol | Description | Open source? (language) |
|---|---|---|---|---|---|
| 1. Translate Perceived Demand to Program (Generator: basis function space builder; approximation error) | a. Perceived Demand | Vensim | 👁 Eye | Reads mental model; translates to cyclic directed graph generator | X (has free version) |
| | b. Analyzed Demand | SDA | 🧠 Brain | Finds dominant cycle of generator; maps with system behavior | O (Mathematica, R) |
| 2. Compute Scientific Draws (Discriminator: hypothesis function space builder; optimization error) | a. Computed Draws | Stan | 🐴 Workhorse | Builds posterior space; runs HMC, ADVI, BFGS for representative draws | O (Stan, connected to Python, R, Julia) |
| | b. Verified and Validated Draws | SBC | 👌 Test | Diagnoses graphically; calibrates architecture, policy, parameter-prior discriminator | O (R) |
| 3. Supply Data for Rationing (customized policy parameter estimation; statistical error; Customize := prior precision conditioning, U4, U5 from User-Program WF) | a. Theoretical Policy Parameter | BATS (communicating with Yaman, Gönenç) | 🦇 Explore | Specifies policy parameter for demanded behavior | O (Python) |
| | b. Empirical Policy Parameter | SOPS (communicating with Erling) | 🚀 One-shot | Optimizes policy one-shot in stochastic dynamic system | O (Powersim) |
| Iterate and communicate | Hub | PySD, readsdr | 🗣 Language | | O (Python, R) |

image

  • A detailed summary of each package is being documented here
  • Collaboration with the PySD team is happening through this PR
  • Case studies may be presented in cookbook format like here (tbd)
  • Users can cook their decision by programming demand, sampling scientific draws, and collecting sensible data

dataindm's People

Contributors

hyunjimoon, jandraor


dataindm's Issues

User-Program Workflow with Specified Programs

Image

Question/Feedback needed:

To Yaman.

  1. Does the order [Vensim-SDA]-[Stan-SBC]-[BATS-SOPS] (detailed description here) look reasonable to you?

  2. From your talk, "Policy analysis and design: policy parameter(s) specification based on desired behavior patterns" -- can every policy be expressed in the form of a policy parameter?

  3. How may I understand the difference between the left figure's 2.a SDA (Rogelio's structural dominance analysis) and the figure's 3.a BATS? My tweaks are being updated in this issue.

To Others.
4. Is optimization error only in step 3, or in step 2 as well? The domain of optimization error in step 2 is the model parameter (P4.draws), while in step 3 it is the policy parameter.
5. Is statistical error (related to U5.a precision) only in step 2, or in step 3 as well? Step 2 precisions are for ODE integration, while step 3 precisions (policy specification based on desired behavior) are for ---
optimization error := d(f(x), f(x*)), where x* is the true optimum and x is what the optimization algorithm found as optimal
statistical error := Monte Carlo error, i.e. replacing integration with a finite sample's average
draws := samples from the parameter's posterior distribution, produced by the U5.b posterior approximator
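
The statistical (Monte Carlo) error defined above can be made concrete with a small sketch, assuming plain Monte Carlo with uniform draws rather than Stan's samplers: a finite sample's average replaces the integral, and the error shrinks roughly as 1/sqrt(N).

```python
import random

def mc_estimate(f, n, seed=0):
    # Replace the integral of f over [0, 1] with a finite sample's average
    rng = random.Random(seed)
    return sum(f(rng.random()) for _ in range(n)) / n

truth = 1.0 / 3.0  # exact integral of x^2 over [0, 1]
err_small = abs(mc_estimate(lambda x: x * x, 100) - truth)
err_large = abs(mc_estimate(lambda x: x * x, 100_000) - truth)
# on average err_large is roughly sqrt(1000) times smaller than err_small
```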

Abstracting model prior and family for hierarchical ODE in U34

Goal: abstract U34 in this table to the highest level
image

  1. How often does the measurement model (family) change? Can all (or some) changes be covered by a prior change (e.g. adding hierarchy; changing the family from poisson to neg_binom is the same as adding a gamma prior for the rate)?

  2. In what scenarios does a user decide to change the prior distribution (not just the prior parameters)?

  3. Could you give some feedback on the following prior type classification (details here)?
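
Question 1's parenthetical claim can be sanity-checked numerically: mixing a Poisson rate over a gamma distribution yields exactly a negative binomial. The sketch below (illustrative Python, not part of any package here) integrates the Poisson pmf against a gamma density and compares it with the closed-form negative binomial pmf.

```python
import math

def gamma_pdf(lam, shape, rate):
    # Gamma density, shape/rate parameterisation
    return math.exp(shape * math.log(rate) + (shape - 1.0) * math.log(lam)
                    - rate * lam - math.lgamma(shape))

def poisson_pmf(y, lam):
    return math.exp(y * math.log(lam) - lam - math.lgamma(y + 1))

def gamma_poisson_pmf(y, shape, rate, grid=20000, upper=60.0):
    # P(y) = integral over lam of Poisson(y | lam) * Gamma(lam | shape, rate),
    # approximated with a plain Riemann sum (integrand vanishes at both ends)
    h = upper / grid
    return sum(poisson_pmf(y, i * h) * gamma_pdf(i * h, shape, rate)
               for i in range(1, grid)) * h

def neg_binom_pmf(y, shape, rate):
    # Closed-form gamma-Poisson mixture: NB with r = shape, p = rate / (1 + rate)
    return math.exp(math.lgamma(y + shape) - math.lgamma(shape) - math.lgamma(y + 1)
                    + shape * math.log(rate / (1.0 + rate)) - y * math.log(1.0 + rate))

# the mixture pmf and the negative binomial pmf agree pointwise
shape, rate = 2.5, 0.7
for y in range(5):
    assert abs(gamma_poisson_pmf(y, shape, rate) - neg_binom_pmf(y, shape, rate)) < 1e-4
```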

image

  1. Confused about the brms quasi families:
quasi(link = "identity", variance = "constant")
quasibinomial(link = "logit")
quasipoisson(link = "log")

Prevent `est_param`, `ass_param` from being written inside the function block

inventory_adjustment_time (an est_param) and safety_stock_coverage (an ass_param) are automatically included in the function block as follows, because users declared their initial values in Vensim.

        real inventory_adjustment_time = 8;

Instead, est_param should be included as an input argument of vensim_func.

However, I don't know which of the following two options is better for ass_param:
a. automatically write it inside the function block
b. have users explicitly pass it as a predictor (like D and V in the ODE vignette)

functions {
  vector one_comp_mm_elim_abs(real t, vector y,
                              real k_a,  real K_m, real V_m, // `est_param`
                              real D, real V) { // `ass_param`
    vector[1] dydt;

    real dose = 0;
    real elim = (V_m / V) * y[1] / (K_m + y[1]);

    if (t > 0)
      dose = exp(- k_a * t) * D * k_a / V;

    dydt[1] = dose - elim;

    return dydt;
  }
}

Compared to a Stan model, where we would have fewer than ten parameters (both assumed and estimated), large SD models like World Dynamics would have at least forty parameters, which makes it less error-prone to go with option a.
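
To compare the two options concretely, here is a hypothetical code-generation sketch (the helper below is an illustration, not part of the actual StanModelBuilder): `est_param` entries always become function arguments, while `ass_param` entries are either declared inline (option a) or appended to the signature (option b).

```python
def build_signature(func_name, est_params, ass_params, pass_ass_as_args):
    # est_param entries are always input arguments; ass_param handling is the choice
    args = ["real t", "vector y"] + [f"real {p}" for p in est_params]
    if pass_ass_as_args:  # option b: users explicitly pass ass_param as predictors
        args += [f"real {p}" for p in ass_params]
    header = f"vector {func_name}({', '.join(args)}) {{"
    body = [] if pass_ass_as_args else [
        f"    real {name} = {value};" for name, value in ass_params.items()
    ]  # option a: declarations written inside the function block
    return "\n".join([header] + body)

print(build_signature("vensim_func",
                      ["inventory_adjustment_time"],
                      {"safety_stock_coverage": 8},
                      pass_ass_as_args=False))
```

With forty-plus parameters, option b's signature becomes long and fragile, which is the argument for option a above.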

Modularize function block

1. Function modularizing

a. When modularizing functions (only using this line), variable name sanitizing should be called within StanFunctionBuilder. But then, should this sanitization happen in all block builders? Perhaps a sanitizing class might help?

b. functions{} should be erased
image

2. Should the function return type change from vector to real?

detail here

Formula and Auto prior from brms

For discussion on 15/08/2022:

  • How will users supply priors?
  • How will users supply the measurement model (likelihood function/sampling statement)?

Automate hierarchical prior R2-D2-M2

From this paper, how implementable is the R2-D2-M2 prior? (Are there any components that might not work under an ODE model?)

"joint regularization of all regression coefficients in Bayesian linear multilevel models can be carried out. To do so, we have proposed the R2-D2-M2 prior that imposes a prior on the global coefficient of determination R2 and propagates the resulting regularization to the individual regression coefficients via a Dirichlet decomposition"

image

image

image

Measurement model

A critical component for performing inference (from either a frequentist or Bayesian perspective) is the definition of the measurement model. This component connects the System Dynamics model (X) with data (y). Let's denote the measurement model by $\pi(y | X(\theta))$, where $\pi$ is a probabilistic function and $\theta$ represents a vector of model parameters. If we fix $\theta$, we refer to the measurement model as the sampling distribution; from this distribution we can obtain simulated measurements. If we fix y, we refer to the measurement model as the likelihood function, which acts as a device to determine whether a measurement is close to its expected value.

Since the parameterisation of the measurement model is not unique, given the various choices of probability distributions, users are required to input the appropriate probabilistic function for the System Dynamics model. The challenge from the developers' point of view is how to make this process as seamless as possible. The purpose of this issue is to identify and discuss user-friendly approaches for specifying measurement models, along with sensible default choices. For instance, the specification of the negative binomial distribution entails an additional parameter (phi).

A proposal for this specification is to require users to employ the Stan language:
"y ~ neg_binomial_2(stock_name, phi)"

Inference using Stan

Goal: make the first mockup, communicate with the pysd dev team (due 7/31)

Done

  • Drafting an information map between SD models and hierarchical Bayesian models: table here

  • Drafting a computational map from .xmile (.mdl) to .stan: detailed progress on Shinyoung's pysd pj board

  • Classified SD variables into endogenous and exogenous and defined seven datatypes (endo: observed outcome, simulated outcome (level var), simulated latent (aux. var); exo: assumed parameter, estimated parameter, predictor); classified parameters into time, rate, ratio, and function

Doing

  • Jair's review of the info map and Angie's review of the comp map
  • Shinyoung and Angie run Stan's .sample with externally generated predictor data
  • Jair and Angie document the Inventory model like the prey-predator and SIR models below:
    -- SIR-family model: stancode for SIR (in R)
    -- Lotka-Volterra (LV) prey-predator model: notebook, inference_stan, sbc_stan (in Python)

TODO

  • Refine the lookup function (possible connection with nonlinear fitting Python scripts; more detail in the pysd pj board)

Data along the time axis

If we're integrating from time 0 to T, say we want some variable X for which we know the observed values at each discrete timestep. Stan needs the variable to be defined along the entire [0, T] real interval, meaning some interpolation is necessary.

The obvious solution is to use piecewise linear approximation, much like LOOKUP. The minor problem/inconvenience is that the number of conditional branches grows with the number of discrete timesteps.
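
A minimal Python sketch of that piecewise linear approximation (illustrative; the generated Stan code would instead unroll one conditional branch per timestep, as in the LOOKUP translation elsewhere in this repo):

```python
def interpolate(times, values, t):
    # Piecewise linear interpolation of discrete observations onto continuous time,
    # with constant extrapolation outside [times[0], times[-1]] (LOOKUP-style)
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    for i in range(1, len(times)):
        if t <= times[i]:
            slope = (values[i] - values[i - 1]) / (times[i] - times[i - 1])
            return values[i - 1] + slope * (t - times[i - 1])

print(interpolate([0, 1, 2, 3], [0.0, 2.0, 3.0, 3.5], 1.5))  # → 2.5
```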

Parsing lookup function

Running the first script returns multiple (nine) identical functions. -> This turned out to be the kernel's problem. Another bug, replacing the function's name, is left in the comment below: #13 (comment)

vf = VensimFile("vensim_models/demand-supply_wolookup.mdl")
vf.parse()

am = vf.get_abstract_model()

stan_builder = StanModelBuilder(am)
stan_builder.print_variable_info()
print(stan_builder.create_stan_program(["demand"]))
original name                 stan variable name            is stock
--------------------------------------------------------------------
Supply Start Rate Desired     supply_start_rate_desired     
Order Fulfilment Ratio        order_fulfilment_ratio        
Safety Stock Coverage         safety_stock_coverage         
Order Fulfillment Rate        order_fulfillment_rate        
Backlog Adj Rate Desired      backlog_adj_rate_desired      
Supply Line Desired           supply_line_desired           
Critical Ratio                critical_ratio                
Backlog Desired               backlog_desired               
Max Shipment Rate             max_shipment_rate             
Shipment Rate                 shipment_rate                 
Demand Rate Adj Rate          demand_rate_adj_rate          
Inventory Safety Time Desired inventory_safety_time_desired 
Supply Rate Desired           supply_rate_desired           
Demand Rate                   demand_rate                   
Forecasted Demand Rate        forecasted_demand_rate        V
Inventory Adj Time            inventory_adj_time            
Supply Line Adj Time          supply_line_adj_time          
Shipment Lead Time            shipment_lead_time            
Backlog Adj Time              backlog_adj_time              
Inventory Adj Rate            inventory_adj_rate            
Supply Line Adj Rate          supply_line_adj_rate          
Backlog                       backlog                       V
Order Rate                    order_rate                    
Supply Rate                   supply_rate                   
Inventory Desired             inventory_desired             
Demand Adj Time               demand_adj_time               
Inventory                     inventory                     V
Supply Lead Time              supply_lead_time              
Supply Start Rate             supply_start_rate             
Supply Line                   supply_line                   V
FINAL TIME                    final_time                    
INITIAL TIME                  initial_time                  
SAVEPER                       saveper                       
TIME STEP                     time_step                     
functions {
    real lookupFunc_0(real x){
        # x (0, 2) = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.0)
        # y (0, 1) = (0.0, 0.2, 0.4, 0.58, 0.73, 0.85, 0.93, 0.97, 0.99, 1.0, 1.0, 1.0)
        real slope;
        real intercept;

        if(x <= 0.2)
            intercept = 0.0;
            slope = (0.2 - 0.0) / (0.2 - 0.0);
            return intercept + slope * (x - 0.0);
        else if(x <= 0.4)
            intercept = 0.2;
            slope = (0.4 - 0.2) / (0.4 - 0.2);
            return intercept + slope * (x - 0.2);
        else if(x <= 0.6)
            intercept = 0.4;
            slope = (0.58 - 0.4) / (0.6 - 0.4);
            return intercept + slope * (x - 0.4);
        else if(x <= 0.8)
            intercept = 0.58;
            slope = (0.73 - 0.58) / (0.8 - 0.6);
            return intercept + slope * (x - 0.6);
        else if(x <= 1.0)
            intercept = 0.73;
            slope = (0.85 - 0.73) / (1.0 - 0.8);
            return intercept + slope * (x - 0.8);
        else if(x <= 1.2)
            intercept = 0.85;
            slope = (0.93 - 0.85) / (1.2 - 1.0);
            return intercept + slope * (x - 1.0);
        else if(x <= 1.4)
            intercept = 0.93;
            slope = (0.97 - 0.93) / (1.4 - 1.2);
            return intercept + slope * (x - 1.2);
        else if(x <= 1.6)
            intercept = 0.97;
            slope = (0.99 - 0.97) / (1.6 - 1.4);
            return intercept + slope * (x - 1.4);
        else if(x <= 1.8)
            intercept = 0.99;
            slope = (1.0 - 0.99) / (1.8 - 1.6);
            return intercept + slope * (x - 1.6);
        else if(x <= 2.0)
            intercept = 1.0;
            slope = (1.0 - 1.0) / (2.0 - 1.8);
            return intercept + slope * (x - 1.8);
        else if(x <= 2.0)
            intercept = 1.0;
            slope = (1.0 - 1.0) / (2.0 - 2.0);
            return intercept + slope * (x - 2.0);
    }

    # ... the same lookupFunc_0 definition is repeated eight more times verbatim (the nine duplicates reported above); omitted ...

    // Begin ODE declaration
    vector vensim_func(real time, vector outcome, real demand){
        real forecasted_demand_rate = outcome[1];
        real backlog = outcome[2];
        real inventory = outcome[3];
        real supply_line = outcome[4];

        real demand_adj_time = 3;
        real inventory_adj_time = 3;
        real critical_ratio = 0.8;
        real safety_stock_coverage = 3 * critical_ratio;
        real shipment_lead_time = 3;
        real inventory_safety_time_desired = shipment_lead_time + safety_stock_coverage;
        real inventory_desired = forecasted_demand_rate * inventory_safety_time_desired;
        real inventory_adj_rate = (inventory_desired - inventory) / inventory_adj_time;
        real supply_rate_desired = forecasted_demand_rate + inventory_adj_rate;
        real supply_line_adj_time = 3;
        real supply_lead_time = 3;
        real supply_line_desired = forecasted_demand_rate * supply_lead_time;
        real supply_line_adj_rate = (supply_line_desired - supply_line) / supply_line_adj_time;
        real supply_start_rate_desired = fmax(0, supply_rate_desired + supply_line_adj_rate);
        real supply_start_rate = supply_start_rate_desired;
        // Vensim RANDOM NORMAL(min, max, mean, sd, seed); Stan has no direct
        // equivalent, so this call is a placeholder emitted by the translator
        real demand_rate = random_normal(0, 200, 100, 10, 1111);
        real demand_rate_adj_rate = (demand_rate - forecasted_demand_rate) / demand_adj_time;
        real forecasted_demand_rate_dydt = demand_rate_adj_rate;
        real backlog_desired = 0;
        real supply_rate = supply_line / supply_lead_time;
        real backlog_adj_time = 3;
        real backlog_adj_rate_desired = (backlog - backlog_desired) / backlog_adj_time;
        real max_shipment_rate = inventory / shipment_lead_time;
        real order_fulfilment_ratio = fmin(1, max_shipment_rate / (backlog_adj_rate_desired + 0.001));
        real shipment_rate = backlog_adj_rate_desired * order_fulfilment_ratio;
        real order_fulfillment_rate = shipment_rate;
        real order_rate = demand_rate;
        real supply_line_dydt = supply_start_rate - supply_rate;
        real backlog_dydt = order_rate - order_fulfillment_rate;
        real inventory_dydt = supply_rate - shipment_rate;

        return [forecasted_demand_rate_dydt, backlog_dydt, inventory_dydt, supply_line_dydt]';
    }
}
data{
}
transformed data{
}
parameters{
}
transformed parameters {
    real forecasted_demand_rate_initial = random_normal(0, 200, 100, 10, 1111);
    real backlog_initial = 0;
    real inventory_initial = random_normal(0, 200, 100, 10, 1111) * 3 + 3 * 0.8;
    real supply_line_initial = random_normal(0, 200, 100, 10, 1111) * 3;
    vector[4] initial_outcome = [forecasted_demand_rate_initial, backlog_initial, inventory_initial, supply_line_initial]';
    array[T] vector[4] integrated_result = ode_rk45(vensim_func, initial_outcome, initial_time, times, demand);
}
model{
}
generated quantities{
}
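The piecewise-linear if/else ladder at the top of the functions block is just a table lookup. As a sanity check outside Stan, the same shape can be reproduced with `numpy.interp`; the knots below are read off the generated branches (the segments below x = 1.0 are cut off in the listing above, so treat this as a partial table):

```python
import numpy as np

# Knot table recovered from the generated branches (x <= 1.0 onward only)
xs = np.array([1.0, 1.2, 1.4, 1.6, 1.8, 2.0])
ys = np.array([0.85, 0.93, 0.97, 0.99, 1.0, 1.0])

def lookup(x):
    # np.interp clamps outside [xs[0], xs[-1]], matching Vensim lookup behavior
    return float(np.interp(x, xs, ys))

print(lookup(1.1))  # midpoint of the 1.0-1.2 segment
```

This is also roughly 6 lines instead of 30, which suggests the codegen could emit a data-driven lookup rather than an if/else ladder.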

Measure distance between classified behavior spaces

Using ISTS or BATS, behavior is classified into 25 codes (subclasses branched from 6 classes), but how close are these classes to each other? For one, do two codes belonging to the same class always lie closer together than two codes from different classes?
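One naive baseline for "distance" on such a two-level taxonomy is path length in the class tree: 0 for identical codes, 2 within a class, 4 across classes (via the root). Whether this matches behavioral similarity is exactly the open question above. A sketch, with hypothetical code labels of the form class letter + subclass digit:

```python
# Hypothetical 2-level taxonomy: codes like "A1" (class "A", subclass "1").
# Tree-path distance: 0 if identical, 2 within a class, 4 across classes.
def tree_distance(code_a: str, code_b: str) -> int:
    if code_a == code_b:
        return 0
    if code_a[0] == code_b[0]:  # same top-level class
        return 2
    return 4  # path goes up through the root

print(tree_distance("A1", "A2"))  # 2
```

By construction this metric answers "yes" to the question above, which is why a behavior-based distance (e.g. between representative trajectories of each code) would be the more informative comparison.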

Some interface design decisions regarding input data and outcomes in data2draws

A rough version of data2draws is implemented. You can drive it with the following Python code:

model = StanVensimModel("ds_white_sterman", am, 0.0, list(range(0, 10)))  # 0.0 is the initial time; range(0, 10) gives the integration times
model.set_prior("inventory_adjustment_time", "normal", 0, 1)
model.set_prior("minimum_order_processing_time", "normal", 0, 1)

This (currently) results in the following generated code:

#include ds_white_sterman_functions.stan

data{
}

transformed data{
    real initial_time = 0.0;
    int T = 10;
    array[T] real times = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
}

parameters{
    real inventory_adjustment_time;
    real minimum_order_processing_time;
}

transformed parameters {
    // Initial ODE values
    real inventory_initial = 2 + 2 * 10000;
    real expected_order_rate_initial = 10000;
    real work_in_process_inventory_initial = 8 * fmax(0, 10000 + 2 + 2 * 10000 - 2 + 2 * 10000 / 8);

    vector[3] initial_outcome;  // Initial ODE state vector
    initial_outcome[1] = inventory_initial;
    initial_outcome[2] = expected_order_rate_initial;
    initial_outcome[3] = work_in_process_inventory_initial;

    array[T] vector[3] integrated_result = ode_rk45(vensim_ode, initial_outcome, initial_time, times, minimum_order_processing_time, inventory_adjustment_time);
}

model{
    inventory_adjustment_time ~ normal(0, 1);
    minimum_order_processing_time ~ normal(0, 1);
}

The problem is how we're going to expose `array[T] vector[3] integrated_result` to the user. There are two ways the user would want to access it: (1) along the time axis, which yields an array of length T for a single stock, and (2) at a single timestep, which yields a vector of length 3.
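One option, sketched below, is a thin Python wrapper over the T-by-n_stocks solution matrix that supports both access patterns. This is not the project's decided API; the class and method names are hypothetical:

```python
import numpy as np

class IntegratedResult:
    """Hypothetical wrapper over the T x n_stocks ODE solution matrix."""
    def __init__(self, values, stock_names):
        self.values = np.asarray(values)  # shape (T, n_stocks)
        self.index = {name: i for i, name in enumerate(stock_names)}

    def stock(self, name):
        # Access pattern 1: one stock along the whole time axis -> length-T array
        return self.values[:, self.index[name]]

    def at(self, t):
        # Access pattern 2: all stocks at a single timestep -> length-n_stocks vector
        return self.values[t, :]

res = IntegratedResult([[1, 4], [2, 5], [3, 6]], ["inventory", "backlog"])
print(res.stock("backlog"))  # [4 5 6]
print(res.at(1))             # [2 5]
```

Named stock access also sidesteps the user having to remember which index of the state vector each stock was assigned by the codegen.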

Another problem is linking the codegen with user-specified data. I'm not sure in what form the user should input the data. If data were specified, we'd have to construct a data block that matches the inputted data.
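As a sketch of one possibility (not an implemented interface), the user could pass a dict mapping observed-variable names to series, from which a matching data block is generated:

```python
def make_data_block(observed: dict) -> str:
    """Generate a Stan data block matching user-supplied observations.

    observed maps variable names to lists of values; all series share length T.
    """
    lengths = {len(v) for v in observed.values()}
    assert len(lengths) == 1, "all observed series must have equal length"
    lines = ["data{", "    int<lower=1> T;"]
    for name in observed:
        lines.append(f"    array[T] real {name};")
    lines.append("}")
    return "\n".join(lines)

print(make_data_block({"inventory_obs": [3.1, 3.4, 3.2]}))
```

The same dict could then be handed to CmdStanPy/CmdStanR as the data argument, keeping the declaration and the values in sync by construction.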

Parameter indexing in function and gq block

  1. Which is better as a function input argument: theta[2] or named alpha, beta?
  2. Could we vectorize in the gq block without an extra transformed data block, as in the solution below?
    theta_tilde[1] = normal_rng(1, 0.5); //TODO U5
    theta_tilde[3] = normal_rng(1, 0.5); //TODO U5
    theta_tilde[2] = normal_rng(0.05, 0.05); //TODO U5
    theta_tilde[4] = normal_rng(0.05, 0.05); //TODO U5

Solution suggested here:

transformed data {
  int N = 10;
  vector[N] ones_N = rep_vector(1, N);
}

generated quantities {
  array[N] real yrep = normal_rng(ones_N * 1, ones_N * 10);
}
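The NumPy analogue makes clear why the helper vector works: multiplying a ones vector by a scalar broadcasts the location/scale into N-long arguments, so one vectorized call replaces the element-by-element draws:

```python
import numpy as np

rng = np.random.default_rng(1234)
N = 10

# Loop version, analogous to filling theta_tilde entries one by one
loop_draws = np.array([rng.normal(1, 10) for _ in range(N)])

# Vectorized version, analogous to normal_rng(ones_N * 1, ones_N * 10):
# ones(N) * scalar expands the scalar mean/sd into length-N arguments
vec_draws = rng.normal(np.ones(N) * 1, np.ones(N) * 10)

print(vec_draws.shape)  # (10,)
```

In Stan the ones vector has to live in transformed data (or be built in gq) because generated quantities cannot declare it lazily mid-expression, which is the cost the question above is asking about.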

Implement Dan's pc-prior

Dan wrote a blog post on 9/3/2022 updating his previous paper on PC priors, which is practical because users usually don't have to derive the priors themselves. As he puts it, "a lot of that complexity is the price we pay for dealing with densities. We think that this is worth it, and the lesson that the parameterisation that you are given may not be the correct parameterisation to use when specifying your prior is an important one!" Below are the four principles from Dan's blog, which include specific implementation.

  1. Occam’s razor: We have a base model that represents simplicity and we prefer our base model.

  2. Measuring complexity: We define the prior using the square root of the KL divergence between the base model and the more flexible model. The square root ensures that the divergence is on a similar scale to a distance, but we maintain the asymmetry of the divergence as a feature (not a bug).

  3. Constant penalisation: We use an exponential prior on the distance scale to ensure that our prior mass decreases evenly as we move farther away from the base model.

  4. User-defined scaling: We need the user to specify a quantity of interest Q(ξ), a scale U, and a tail probability α. We choose the scaling of the prior so that P(Q(ξ) > U) = α. This ensures that when we move to a new context, we are able to modify the prior by using the relevant information about the scale.
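Concretely, principles 3 and 4 combine into a one-line calibration: for the exponential prior on the distance scale, choosing rate λ = -log(α)/U makes P(d > U) = α. A minimal sketch of that calibration step:

```python
import math

def pc_prior_rate(U: float, alpha: float) -> float:
    """Rate of the exponential prior on the distance scale so that P(d > U) = alpha."""
    return -math.log(alpha) / U

lam = pc_prior_rate(U=3.0, alpha=0.05)
# Verify the tail condition: P(d > U) = exp(-lam * U) should equal alpha
print(math.exp(-lam * 3.0))
```

The remaining (harder) work, which Dan's post covers, is pushing this exponential on the distance back through d(ξ) = sqrt(2 KL) to a density on the original parameter.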

Minor features to be added

  1. adding a column that expresses the value of variables given as constants in the .mdl file
  2. changing the array type to vector[# of observed stocks]:
    array[] vector integrated_result  ->  vector<lower=0>[2] integrated_result_tilde[N]

Flow variable as observed data

Anyone (@jandraor @Dashadower @hazhirr) feel free to comment, please!

@hazhirr mentioned that flow variables can be observed data (a.k.a. driving data), unlike my assumption that only stocks can be observed. I wonder whether this can be modeled under the original framework, where a distribution family is used to penalize the difference between the simulated stock variable (integrated_result) and the observed stock variable.

For instance, assume the model has two stock variables (S1, S2), two flow variables (F1, F2), and two parameters (P1, P2). We have three scenarios:

  1. Data for S1
  2. Data for S1, F1
  3. Data for S1, S2, F1

Q1. Based on the insight @jandraor shared that dynamics are a function of only two things, exogenous parameters and stocks, may I call the three scenarios under-identified, well-identified, and over-identified, respectively?

Q2. If so, should we expect an infinite number of consistent parameter sets in the under-identified case and none in the over-identified case? Would giving priors be the solution to this identification problem?

Q3. For scenario 1, is there any way we can pre-calculate S2 using S1 and F1, then compare this with the result purely integrated from the start? Integration will happen in the transformed parameters block whereas the comparison will be in the model block. @Dashadower and I concluded there is no easy way to do this, but wanted to check with you, @jandraor.

Q4. When we say state variables, is this equivalent to stock variables?
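For Q3, the pre-calculation idea amounts to integrating the observed net flow on the data grid: if F1 were the only flow into/out of S2 (an assumption for illustration), then S2(t) = S2(0) + Σ F1·Δt, i.e. a cumulative sum. A sketch:

```python
import numpy as np

def reconstruct_stock(s0, net_flow, dt):
    """Euler reconstruction of a stock from its observed net-flow series."""
    return s0 + np.cumsum(np.asarray(net_flow) * dt)

# Hypothetical observed net flow of S2, sampled at dt = 1
s2 = reconstruct_stock(s0=10.0, net_flow=[1.0, 2.0, -0.5], dt=1.0)
print(s2)
```

The catch, consistent with the "no easy way" conclusion, is that this reconstruction lives on the data grid while the ODE solver integrates on its own adaptive grid, so the two S2 trajectories are not directly comparable inside the model block without interpolation.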

Parallelize loglik computation using reduce_sum

Write a new function in a for-loop fashion so that the log-likelihood computation can be parallelized (using +'s associativity). This means a new component in the function block and changing ~ to target += in the model and gq blocks.

// functions
real partial_sum_lpmf(array[] int slice_n_redcards,
                      int start, int end,
                      array[] int n_games,
                      vector rating,
                      vector beta) {
  return binomial_logit_lupmf(slice_n_redcards |
                              n_games[start:end],
                              beta[1] + beta[2] * rating[start:end]);
}
// model
...
  target += reduce_sum(partial_sum_lupmf, n_redcards, grainsize,
                       n_games, rating, beta);

Ref
Start with Sebastian's introduction on the Gelman blog.

  1. A case study which adapts Richard McElreath’s intro to map_rect for reduce_sum
  2. User manual introduction to reduce_sum parallelism with a simple example as well: 23.1 Reduce-Sum
  3. Function reference: 9.4 Reduce-Sum Function
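What makes reduce_sum safe is purely the associativity of +: partial sums over slices can be computed in any order and combined. A plain-Python analogue of the slicing (hypothetical log-likelihood terms; actual parallel dispatch omitted for brevity):

```python
def partial_sum(terms, start, end):
    """Log-likelihood contribution of one slice, mirroring partial_sum_lpmf."""
    return sum(terms[start:end])

def reduce_sum(terms, grainsize):
    """Combine slice-wise partial sums; the slices could run in parallel."""
    slices = [(i, min(i + grainsize, len(terms)))
              for i in range(0, len(terms), grainsize)]
    return sum(partial_sum(terms, s, e) for s, e in slices)

loglik_terms = [-1.2, -0.7, -2.1, -0.3, -1.5]  # hypothetical per-observation terms
assert abs(reduce_sum(loglik_terms, grainsize=2) - sum(loglik_terms)) < 1e-12
```

Note that floating-point addition is only approximately associative, which is why Stan documents reduce_sum results as reproducible only up to the usual numerical tolerance.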

Cyclic models

Obviously you can expect some models to be cyclic, such as a stock affecting its own flow. That should be the only permissible cyclic structure, since when it is translated to Stan the stock value on the right-hand side simply refers to the previous time step's state.

This is because we need sequential evaluation in Stan; any other cyclic component prevents this from working.
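The codegen could enforce this with a topological sort over the auxiliary-variable dependency graph, treating stocks as sources (their current values are known from the previous step), so stock-to-flow edges never form a cycle. A sketch using Python's stdlib graphlib, with hypothetical variable names:

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical dependency graph: each auxiliary maps to the variables it reads.
# The stock "inventory" appears only as an input, so the stock -> flow loop
# closes through time, not through this graph, and is therefore acyclic here.
deps = {
    "shipment_rate": {"inventory"},
    "order_fulfilment": {"shipment_rate"},
}

try:
    order = list(TopologicalSorter(deps).static_order())
    print(order)
except CycleError:
    print("model contains a non-stock cycle; cannot emit sequential Stan code")
```

Any genuine algebraic loop among auxiliaries would raise CycleError, giving the translator a precise place to reject the model.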
