
bayessd's People

Contributors

dashadower, hyunjimoon, jandraor, tomfid


bayessd's Issues

Driving data stan builder

In commit ea6c358 we added the DataStructureCodeGenWalker class. Unlike LookupStructureCodeGenWalker, it does not need a name dictionary, because each AST node carries its own name, as shown below in the output of printing element and component with this script.

name: Initial Customer Order Rate Data
length: 1
type: Data
subtype: Normal
subscript: ([], [])
DataStructure

There were two implementation options: a vector of times versus a scalar time as input. We tried the first, but due to [Q1 what was the problem? ] we ended up with the second. New things I learned:

  • The dependency graph and the abstract syntax tree are different
  • ReferenceStructure exists
  • Subscripts are included in pysd.builder, which we can use to proceed to a hierarchical model
  • The noise seed is included as a Normal-type node [Q2 role of normal type?] in the AST

@Dashadower On top of the two questions above, Q3: could you explain how you used ReferenceStructure for the topological sort (e.g. code with brief comments)? Not a priority, but I think we could benefit from some documentation within the functions. To be specific, comparing the AST node named Inventory with the Vensim interface teaches us how IntegStructure, ArithmeticStructure, and ReferenceStructure map to INTEG, -, and variable references in Vensim. However, I still don't follow how the topological sort produces the dependency graph (construction of it, then walking it?).
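To make Q3 concrete, here is a minimal sketch of the mechanism as I imagine it (the variable names and dependency sets are simplified stand-ins taken from the dump below, not the actual bayessd implementation): collect the names referenced by each variable's ReferenceStructure nodes as edges, then topologically sort so every variable is emitted after its dependencies.

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Simplified stand-in for the AST walk: each variable maps to the set of
# variable names its equation references via ReferenceStructure nodes.
dependencies = {
    "inventory": {"production_rate", "shipment_rate"},
    "production_rate": {"work_in_process_inventory", "manufacturing_cycle_time"},
    "work_in_process_inventory": set(),
    "manufacturing_cycle_time": set(),
    "shipment_rate": set(),
}

# TopologicalSorter treats each value-set as predecessors of its key, so
# static_order() yields leaves first and "inventory" last.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

Walking the variables in this order would guarantee that, when the code generator emits the statement for inventory, the statements defining its references already exist.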

[image]

name: Inventory
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
IntegStructure:
	ArithmeticStructure:
	 ['-'] (ReferenceStructure(reference='production_rate', subscripts=None), ReferenceStructure(reference='shipment_rate', subscripts=None)),
	ReferenceStructure:
	 desired_inventory
**********
name: Production Rate
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
ArithmeticStructure:
	 ['/'] (ReferenceStructure(reference='work_in_process_inventory', subscripts=None), ReferenceStructure(reference='manufacturing_cycle_time', subscripts=None))
**********
name: Customer Order Rate
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
CallStructure:
	ReferenceStructure:
	 max
	(
		0
		,
		ReferenceStructure:
			 initial_customer_order_rate_data
			)
**********
name: Shipment Rate Measured Data
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
ArithmeticStructure:
	 ['*'] (CallStructure(function=ReferenceStructure(reference='random_normal', subscripts=None), arguments=(0, 2, 1, ReferenceStructure(reference='ship_measurement_noise_scale', subscripts=None), ReferenceStructure(reference='noise_seed', subscripts=None))), ReferenceStructure(reference='shipment_rate', subscripts=None))
**********
name: Ship Measurement Noise Scale
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
0.1
**********
name: Prod Start Measurement Noise Scale
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
0.1
**********
name: Production Start Rate Measured Data
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
ArithmeticStructure:
	 ['*'] (CallStructure(function=ReferenceStructure(reference='random_normal', subscripts=None), arguments=(0, 2, 1, ReferenceStructure(reference='prod_start_measurement_noise_scale', subscripts=None), ReferenceStructure(reference='noise_seed', subscripts=None))), ReferenceStructure(reference='production_start_rate', subscripts=None))
**********
name: Adjustment for WIP
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
ArithmeticStructure:
	 ['/'] (ArithmeticStructure(operators=['-'], arguments=(ReferenceStructure(reference='desired_wip', subscripts=None), ReferenceStructure(reference='work_in_process_inventory', subscripts=None))), ReferenceStructure(reference='wip_adjustment_time', subscripts=None))
**********
name: Adjustment from Inventory
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
ArithmeticStructure:
	 ['/'] (ArithmeticStructure(operators=['-'], arguments=(ReferenceStructure(reference='desired_inventory', subscripts=None), ReferenceStructure(reference='inventory', subscripts=None))), ReferenceStructure(reference='inventory_adjustment_time', subscripts=None))
**********
name: Backlog
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
IntegStructure:
	ArithmeticStructure:
	 ['-'] (ReferenceStructure(reference='order_rate', subscripts=None), ReferenceStructure(reference='order_fulfillment_rate', subscripts=None)),
	ArithmeticStructure:
	 ['*'] (ReferenceStructure(reference='order_rate', subscripts=None), ReferenceStructure(reference='target_delivery_delay', subscripts=None))
**********
name: Change in Exp Orders
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
ArithmeticStructure:
	 ['/'] (ArithmeticStructure(operators=['-'], arguments=(ReferenceStructure(reference='customer_order_rate', subscripts=None), ReferenceStructure(reference='expected_order_rate', subscripts=None))), ReferenceStructure(reference='time_to_average_order_rate', subscripts=None))
**********
name: Desired Inventory
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
ArithmeticStructure:
	 ['*'] (ReferenceStructure(reference='desired_inventory_coverage', subscripts=None), ReferenceStructure(reference='expected_order_rate', subscripts=None))
**********
name: Desired Inventory Coverage
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
ArithmeticStructure:
	 ['+'] (ReferenceStructure(reference='minimum_order_processing_time', subscripts=None), ReferenceStructure(reference='safety_stock_coverage', subscripts=None))
**********
name: Desired Production
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
CallStructure:
	ReferenceStructure:
	 max
	(
		0
		,
		ArithmeticStructure:
			 ['+'] (ReferenceStructure(reference='expected_order_rate', subscripts=None), ReferenceStructure(reference='adjustment_from_inventory', subscripts=None)))
**********
name: Desired Production Start Rate
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
ArithmeticStructure:
	 ['+'] (ReferenceStructure(reference='desired_production', subscripts=None), ReferenceStructure(reference='adjustment_for_wip', subscripts=None))
**********
name: Desired Shipment Rate
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
ArithmeticStructure:
	 ['/'] (ReferenceStructure(reference='backlog', subscripts=None), ReferenceStructure(reference='target_delivery_delay', subscripts=None))
**********
name: Desired WIP
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
ArithmeticStructure:
	 ['*'] (ReferenceStructure(reference='manufacturing_cycle_time', subscripts=None), ReferenceStructure(reference='desired_production', subscripts=None))
**********
name: Expected Order Rate
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
IntegStructure:
	ReferenceStructure:
	 change_in_exp_orders
	,
	ReferenceStructure:
	 customer_order_rate
	
**********
name: Initial Customer Order Rate Data
length: 1
type: Data
subtype: Normal
subscript: ([], [])
DataStructure
**********
name: Inventory
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
IntegStructure:
	ArithmeticStructure:
	 ['-'] (ReferenceStructure(reference='production_rate', subscripts=None), ReferenceStructure(reference='shipment_rate', subscripts=None)),
	ReferenceStructure:
	 desired_inventory
	
**********
name: Inventory Adjustment Time
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
8
**********
name: Manufacturing Cycle Time
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
8
**********
name: Maximum Shipment Rate
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
ArithmeticStructure:
	 ['/'] (ReferenceStructure(reference='inventory', subscripts=None), ReferenceStructure(reference='minimum_order_processing_time', subscripts=None))
**********
name: Prod Measurement Noise Scale
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
0.1
**********
name: Minimum Order Processing Time
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
2
**********
name: Noise Seed
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
2
**********
name: Order Fulfillment Rate
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
ReferenceStructure:
	 shipment_rate
	
**********
name: Order Fulfillment Ratio
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
CallStructure:
	ReferenceStructure:
	 table_for_order_fulfillment
	(
		ArithmeticStructure:
			 ['/'] (ReferenceStructure(reference='maximum_shipment_rate', subscripts=None), ReferenceStructure(reference='desired_shipment_rate', subscripts=None)))
**********
name: Order Rate
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
ReferenceStructure:
	 customer_order_rate
	
**********
name: Production Rate Measured Data
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
ArithmeticStructure:
	 ['*'] (CallStructure(function=ReferenceStructure(reference='exp', subscripts=None), arguments=(ArithmeticStructure(operators=['*'], arguments=(ReferenceStructure(reference='prod_measurement_noise_scale', subscripts=None), CallStructure(function=ReferenceStructure(reference='random_normal', subscripts=None), arguments=(-6, 6, 0, 1, ReferenceStructure(reference='noise_seed', subscripts=None))))),)), ReferenceStructure(reference='production_rate', subscripts=None))
**********
name: Production Start Rate
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
CallStructure:
	ReferenceStructure:
	 max
	(
		0
		,
		ReferenceStructure:
			 desired_production_start_rate
			)
**********
name: Safety Stock Coverage
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
2
**********
name: Shipment Rate
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
ArithmeticStructure:
	 ['*'] (ReferenceStructure(reference='desired_shipment_rate', subscripts=None), ReferenceStructure(reference='order_fulfillment_ratio', subscripts=None))
**********
name: Table for Order Fulfillment
length: 1
type: Lookup
subtype: Hardcoded
subscript: ([], [])
LookupStructure (interpolate):
	x (0, 2) = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.0)
	y (0, 1) = (0.0, 0.2, 0.4, 0.58, 0.73, 0.85, 0.93, 0.97, 0.99, 1.0, 1.0, 1.0)

**********
name: Target Delivery Delay
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
2
**********
name: Time to Average Order Rate
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
8
**********
name: WIP Adjustment Time
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
2
**********
name: Work in Process Inventory
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
IntegStructure:
	ArithmeticStructure:
	 ['-'] (ReferenceStructure(reference='production_start_rate', subscripts=None), ReferenceStructure(reference='production_rate', subscripts=None)),
	ReferenceStructure:
	 desired_wip
	
**********
name: FINAL TIME
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
100
**********
name: INITIAL TIME
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
0
**********
name: SAVEPER
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
ReferenceStructure:
	 time_step
	
**********
name: TIME STEP
length: 1
type: Auxiliary
subtype: Normal
subscript: ([], [])
0.0625

momentum strategy

I want to compare different momentum strategies, such as cross-sectional momentum (as given by Jegadeesh and Titman), the 52-week-high strategy, residual momentum, and time-series momentum, for a specified period. Could anyone help me develop the code for these strategies?
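As a starting point, here is a hedged sketch of the simplest of these, cross-sectional momentum in the Jegadeesh-Titman spirit: rank assets by trailing formation-period return, go long the winners and short the losers. The data is synthetic and the 12-month lookback, 10-asset universe, and equal weighting are illustrative assumptions, not the paper's exact overlapping J/K portfolio design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic monthly returns: 12 formation months + 1 holding month, 10 assets.
returns = rng.normal(0.01, 0.05, size=(13, 10))

formation = returns[:12]   # trailing 12-month formation window
holding = returns[12]      # next month's realized returns

# Cumulative formation-period return per asset.
momentum = np.prod(1.0 + formation, axis=0) - 1.0

ranks = np.argsort(momentum)        # ascending: losers first, winners last
losers, winners = ranks[:5], ranks[5:]

# Equal-weight long winners, short losers over the holding month.
strategy_return = holding[winners].mean() - holding[losers].mean()
print(strategy_return)
```

Time-series momentum replaces the cross-sectional ranking with each asset's own trailing return sign; residual momentum would first regress out factor exposures before ranking.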

Data structure translated but not used in stan function

The function dataFunc__std_normal_data_gamma is translated but not used in the vensim_ode_func.

vector vensim_ode_func(real time, vector outcome, real delta, real beta, real alpha, real gamma){
    vector[2] dydt;  // Return vector of the ODE function

    // State variables
    real predator = outcome[1];
    real prey = outcome[2];

    real alpha_mean = 0.8;
    real beta_mean = 0.05;
    real delta_mean = 0.05;
    real gamma_mean = 0.8;
    real predator_birth_rate = delta * prey * predator;
    real predator_death_rate = gamma * predator;
    real predator_dydt = predator_birth_rate - predator_death_rate;
    real prey_birth_rate = alpha * prey;
    real prey_death_rate = beta * predator * prey;
    real prey_dydt = prey_birth_rate - prey_death_rate;

    dydt[1] = predator_dydt;
    dydt[2] = prey_dydt;

    return dydt;
}

Stochasticity in function block

This is relevant to a topic previously discussed with @Dashadower, who mentioned it might be doable with C++ coding.

Semantic error in '/Users/hyunjimoon/Dropbox/tolzul/BayesSD/ContinuousCode/5_BayesCalib/stan_files/mngInven_lookup_functions_manual.stan', line 103, column 98, included from
'/Users/hyunjimoon/Dropbox/tolzul/BayesSD/ContinuousCode/5_BayesCalib/stan_files/mngInven_lookup_draws2data_manual.stan', line 2, column 0:
   -------------------------------------------------
   101:      real order_fulfillment_rate = shipment_rate;
   102:      real time_step = 0.0625;
   103:      real white_noise = noise_standard_deviation * 24 * noise_correlation_time / time_step ^ 0.5 * uniform_rng(0,1) - 0.5;
                                                                                                           ^
   104:      real change_in_process_noise = process_noise - white_noise / noise_correlation_time;
   105:      real process_noise_dydt = change_in_process_noise;
   -------------------------------------------------

Random number generators are only allowed in transformed data block, generated quantities block or user-defined functions with names ending in _rng.
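One way around this restriction, when the draw can live outside the ODE solve, is to move it into a user-defined function whose name ends in _rng; Stan then permits the call, but only from transformed data, generated quantities, or other _rng functions. A sketch of what such a helper could look like (the function name and the parenthesization of the Vensim pink-noise formula are my assumptions, not the actual generated code):

```stan
functions {
  // Hypothetical helper: draws the white-noise increment referenced in the
  // error above. The _rng suffix restricts callers to transformed data,
  // generated quantities, or other _rng functions.
  real white_noise_rng(real noise_standard_deviation,
                       real noise_correlation_time,
                       real time_step) {
    return noise_standard_deviation
           * sqrt(24 * noise_correlation_time / time_step)
           * (uniform_rng(0, 1) - 0.5);
  }
}
```

Draws inside the ODE right-hand side itself cannot be expressed this way; that is the part that would need the C++ route mentioned above.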

TEPE tested by Josh Lerner

Theoretical and Empirical Perspectives on Entrepreneurship

Mar.19-26

I shared my belief that the inspection paradox (the probability of observing a quantity being related to the quantity itself) can help me model the difference between system-level and (two) agent-level (perceived) uncertainty. I framed it as bilateral information asymmetry, but "different observer type" seems more fundamental. Josh recommended a paper (still digesting it) and also reaching out to Bob Gibbons. Based on my current belief and goal, I made a mockup to get Josh's evaluation.

Belief: the different observer types (actor and environment) behind the inspection paradox are relevant to different ENT dynamics (e.g. Decker et al., Guzman-Stern)

Goal: understand different ENT dynamics with the help of concrete models of observation and their analogues in transportation

Action: prepare a mockup to get Josh's evaluation of my goal, starting from different observer types, which are the fundamental cause of the inspection paradox. The mockup below synthesizes traffic flow with Josh's Entrepreneurship and Industry Evolution slide:

| Observer/Measurement Type | Entrepreneurial Dynamics (incl. Cause of Heterogeneity) | Traffic Flow (incl. Cause of Non-Stationary Traffic) |
|---|---|---|
| Quantities of Interest (QoI) | Growth | Speed |
| Global Observer | Macro: may overlook nuances of local markets and trends, leading to a generalized growth understanding. Cause: sector-specific trends and macroeconomic factors can obscure individual dynamics. | Aerial survey: observes overall flow, missing finer details. Cause: environmental factors and broad policies affect overall flow. |
| Local Observer | Micro: in-depth understanding of specific markets or sectors, identifying unique growth opportunities. Cause: micro-level factors like firm strategies and local market conditions influence dynamics. | Stationary observer: captures detailed, location-specific dynamics. Cause: localized events and regional policies affect traffic flow in particular areas. |
| QoI Observed by Global Observer Across Time | Longitudinal studies: focus on evolving entrepreneurial activity over time, possibly overlooking cross-sectional differences. Cause: changes in macro conditions and sector-wide shifts. | Time mean speed: measures how dynamics change over time, missing spatial variations. Cause: infrastructure developments or major policy shifts altering traffic patterns over time. |
| QoI Observed by Local Observer Across Space | Cross-sectional studies: highlight geographical or sectoral growth differences, possibly missing temporal trends. Cause: geographic and sector-specific heterogeneity, including local economic conditions. | Space mean speed: captures spatial variation in dynamics, missing temporal changes. Cause: physical characteristics of the road network and localized demand fluctuations affect flow. |
| Cause of Heterogeneity | Variability stems from entrepreneurship types (subsistence vs. transformational), innovation rates, and sector dynamics. | Variations in driver behavior, vehicle types, and travel purposes lead to fluctuating traffic characteristics. |
| Information Asymmetry | In entrepreneurial ecosystems, startups have more detailed knowledge about their potential, while investors need to screen for the best opportunities. | Drivers have more information about their intended routes and timings than the system designed to optimize traffic flow. |
| Role of Intermediaries | Intermediaries like venture capitalists and incubators help bridge the information gap between investors and startups, facilitating efficient resource allocation. | GPS and traffic management systems act as intermediaries, using real-time data to optimize routes and reduce congestion. |
| Integration with Cause of Heterogeneity | Information asymmetry and the strategic role of intermediaries contribute to the observed heterogeneity in entrepreneurial success, growth rates, and innovation levels. | Technology-mediated intermediation mitigates inefficiencies caused by asymmetric information, leading to smoother traffic flow and reduced congestion. |

This table shows how different observational lenses (global vs. local; across time vs. space) shape our understanding of growth and speed, and how information asymmetry and intermediaries interact with the causes of heterogeneity in both domains.

Mar.26 Q

  • Does the table make sense to you?
  • Is it correct that across-sector and across-firm-size are different layers of heterogeneity? How would that affect measurements of growth, based on the table above?
    [image]

raw materials in #214

Implement Dan's DC-prior

Dan wrote a blog post on 9/3/2022 updating his earlier paper on PC priors, which is practical because users usually don't have to derive the priors themselves. In his words, "a lot of that complexity is the price we pay for dealing with densities. We think that this is worth it and the lesson that the parameterisation that you are given may not be the correct parameterisation to use when specifying your prior is an important one!" Below are the four principles from Dan's blog, each with a specific implementation.

Occam’s razor: We have a base model that represents simplicity and we prefer our base model.

Measuring complexity: We define the prior using the square root of the KL divergence between the base model and the more flexible model. The square root ensures that the divergence is on a scale similar to a distance, but we maintain the asymmetry of the divergence as a feature (not a bug).

Constant penalisation: We use an exponential prior on the distance scale to ensure that our prior mass decreases evenly as we move farther away from the base model.

User-defined scaling: We need the user to specify a quantity of interest Q(θ) and a scale U. We choose the scaling of the prior so that P(Q(θ) > U) = α for a user-chosen tail probability α. This ensures that when we move to a new context, we are able to modify the prior by using the relevant information about U.
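Under these principles the prior on the distance d(θ) from the base model is exponential, and the scaling condition pins down its rate. A minimal sketch of that arithmetic, assuming the quantity of interest is the distance itself, so P(d > U) = α gives rate λ = -log(α)/U:

```python
import math

def pc_prior_rate(U, alpha):
    """Rate of the exponential prior on distance d so that P(d > U) = alpha.

    P(d > U) = exp(-lam * U) = alpha  =>  lam = -log(alpha) / U
    """
    return -math.log(alpha) / U

lam = pc_prior_rate(U=3.0, alpha=0.05)
tail = math.exp(-lam * 3.0)  # should round-trip back to alpha
print(lam, tail)
```

The harder, model-specific step that the blog automates is mapping the natural parameter to this distance scale; the arithmetic above is only the final scaling choice.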

🗺️map for startup compass🧭

Discussed in #159

Originally posted by hyunjimoon August 22, 2023
Combining Scott Stern's and Charlie Fine's research on entrepreneurial strategy and operations: Scott's strategic choice among disruptor, architectural, value chain, and IP strategy is a compass🧭, and Charlie's phase-based learning is a map🗺️. Combined, they can serve as personalized navigation and, moreover, suggest alternate routes.
[image]

Furthermore, early-stage scholars stand to benefit from this tool. Knowing that different strategies and paths to their goal exist would encourage them to be more visionary.

nss_gps stands for nail-scale-sail global positioning system; it leverages genAI's information-retrieval capability for operations and innovation management. Charlie Fine has been developing nss since the early 2000s, e.g. https://operations4entrepreneurs.com/.

Below, nss_gps is framed as a tool for startups, just as an example. Since the need for operations and innovation management (e.g. how and when to segment, processify, acculturate, professionalize, automate, collaborate, capitalize, platformize, replicate, evaluate) is shared by startups (e.g. Lucid, VinFast, Rivian) and established companies (e.g. Hyundai, Tesla) alike, nss_gps would architecturally support a wide range of use cases. More in the blueprint below.

[image]

product description

  • establish a state space for the entrepreneur and, given the current and desired state, provide a curated strategy, a.k.a. a global positioning system

[image]

scott, todd, jb, angie's nsf triplet on genome

Disclaimer: this linearization of feedback loops is the outcome of a few months of my 🤯, so please forgive some cute assumptions in mapping one action to one (or at most a few) testing forms/functions. A chain reaction of the hierarchy of need-solution-fulfillment triplets (below) would be the cleanest representation.

Belief: Scott, Todd, and JB's modules on strategy, PMF, and testing interface can be represented with Angie's GATC base

Goal: get three authors' feedback

Mockup:

Each action below lists its subactions: the general definition, then the application in the gene therapeutics industry and in the application-specific processor industry.

G (Goal)
  • Aligning belief and goal: how the startup's team aligns its internal beliefs and objectives. Gene therapeutics: coordination of team efforts to innovate in gene therapy, aligning with an overarching strategic vision. Processors: harmonizing internal visions to lead the advancement of AI processing technology in mobile devices.
  • Form of testing:
    Mockup: at the goal-setting and strategic-alignment phase, mockups are useful for exploring strategic directions without the constraints of actual implementation; they help visualize different strategic paths and build internal consensus.
    Prototype: could also play a role here as a way to concretize strategic goals into tangible objectives, especially for tech startups where early product concepts need to be somewhat tangible to align team goals.
    Gene therapeutics: engaging in stakeholder validation to ensure alignment with strategic goals. Processors: conducting strategic validation tests to affirm commitment to the development of optical I/O computing technologies.
  • Function of testing: employee desirability
  • Uncertainty addressed: identifying and reducing risks related to the startup's direction and innovation focus using feedback from testing. Gene therapeutics: clarifying strategic directions and innovation priorities to mitigate internal uncertainties. Processors: ensuring clarity of strategic vision to maintain alignment with market-leadership objectives.

AT (Asset with Supplier)
  • Aligning belief and goal: alignment between the startup and its suppliers to close the gap between available assets and those needed for product implementation. Gene therapeutics: establishing collaborative relationships with suppliers to secure essential assets for gene therapy solutions. Processors: strategic partnerships with suppliers to ensure asset provision for state-of-the-art AI processor production.
  • Form of testing:
    POC (proof of concept): best suited for this stage, especially when negotiating and aligning with suppliers; a POC can demonstrate the technical feasibility of integrating suppliers' assets or materials into the startup's product, ensuring both parties are aligned in capabilities and expectations.
    Gene therapeutics: evaluating supplier relationships and agreements to ascertain their capability to support innovative product development. Processors: assessing supplier agreements to confirm their capacity to meet technological asset requirements.
  • Function of testing: financial viability
  • Uncertainty addressed: reducing risk by ensuring supplier capabilities and asset availability align with the startup's production needs. Gene therapeutics: minimizing supply-chain uncertainties for materials and technologies essential to gene therapies. Processors: reducing supply risk concerning the quality and availability of components for AI processors.

AC (Asset with Distributor)
  • Aligning belief and goal: alignment between the startup and its distributors to ensure assets are utilized effectively to meet product-function needs. Gene therapeutics: forming strategic alliances with distributors to optimize market penetration for gene therapies. Processors: aligning with distributors to achieve efficient dissemination of AI processors to manufacturers.
  • Form of testing:
    Prototype: useful in this phase for testing logistical and distribution strategies with actual (if early) product versions; this helps align distribution strategies with the product's physical characteristics and requirements.
    Gene therapeutics: testing and refining distribution strategies to maximize market reach and availability of gene therapies. Processors: verifying distribution-network efficiency to ensure market coverage and supply-chain effectiveness.
  • Function of testing: operational feasibility
  • Uncertainty addressed: addressing logistical and market-access challenges to ensure product delivery aligns with customer demand. Gene therapeutics: tackling market-access and logistical uncertainties to streamline the distribution of gene therapies. Processors: addressing delivery risks and demand-fulfillment challenges for global AI processor markets.

T (Technology)
  • Aligning belief and goal: alignment between the startup's technological capabilities and their practical application. Gene therapeutics: ensuring the startup's technological expertise is applied effectively to gene therapy treatments. Processors: aligning the startup's technical know-how with the practical demands for AI processors in mobile devices.
  • Form of testing:
    POC: essential for demonstrating the technological viability of the product; it is about proving the technology works as intended and can be developed into a viable product.
    Prototype: following a successful POC, moving to a prototype phase is natural here, incorporating real materials and addressing production constraints and technical specifications more closely.
    Gene therapeutics: validating technological feasibility to confirm the applicability of the startup's knowledge to treatment solutions. Processors: testing technological capabilities to ensure they translate into high-performing AI processors.
  • Function of testing: technical feasibility
  • Uncertainty addressed: ensuring the technology developed is applicable and meets the practical demands of product and market. Gene therapeutics: bridging the gap between technological knowledge and its application in gene therapy products. Processors: closing the gap between technical knowledge and market requirements for AI processors.

C (Customer)
  • Aligning belief and goal: alignment between the startup's product offerings and the customer's needs and expectations. Gene therapeutics: aligning the startup's solutions with patient and healthcare-provider needs. Processors: ensuring the AI processors' features align with the needs of mobile-device manufacturers.
  • Form of testing:
    MVP (minimum viable product): critical at this stage for testing the product in real market conditions with actual customers; it helps validate the economic viability of the product, gather feedback on its desirability, and understand product-market fit.
    Gene therapeutics: validating customer desirability to ensure gene therapies meet needs and gain acceptance from healthcare providers. Processors: engaging in iterative design and evaluation using MVPs to confirm that the processors meet not only technical specifications but also the real-world needs and performance expectations of device manufacturers.
  • Function of testing: customer desirability
  • Uncertainty addressed: using MVP feedback to reduce market, technological, and operational uncertainties, ensuring the product aligns with customer needs and market demand. Gene therapeutics: reducing doubts about whether gene therapies will be accepted by patients and adopted by healthcare providers by gathering and analyzing feedback on effectiveness and usability. Processors: minimizing market- and product-desirability uncertainties by demonstrating that the AI processors meet or exceed customer requirements and technological expectations for mobile devices.

[Scott's four choices and strategies]

The choices of technology, organization, customer, and competition are determined by four types of entrepreneurial strategy, laid out on the axes of INVESTMENT (execute vs. control) and ORIENTATION (compete vs. collaborate). The 2x2 combinations are:

  • Disruption (execute + compete): targets underserved segments and uses iteration and learning to expand. Examples: NETFLIX, Zipcar, salesforce, amazon, skype, oDesk.
  • Architectural (control + compete): creates an entirely new value chain by controlling a key resource or interface that coordinates multiple stakeholders to provide new consumer value. Examples: facebook, AngelList, ebay, Ford, Etsy, Dell.
  • Value chain (execute + collaborate): aims to be the preferred partner in a slice of an industry's value chain through strong execution and collaboration. Examples: Foxconn, PayPal, madaket, mattermark, DRIZLY, STRATACOM.
  • Intellectual property (control + collaborate): focuses on gaining control of innovations through patents and trademarks, and on collaborating to reduce costs. Examples: Harry Potter, gettyimages, xerox, DOLBY, INTELLECTUAL VENTURES, Genentech.

[JB's four experiment tools]

Mockup: exploration without production constraints (for design phase, multiple representations and media)
Prototype: first of a series, including production constraints (real materials, cost limitations, etc)
POC: technical demonstrator (technological viability)
MVP: commercial demonstrator (economical viability)

e.g.📱mockup to visualize your idea, move to a 🤳prototype to get a feel for its physical presence, create a 📲POC to make sure the core feature (charging) works, and finally develop an ⚙️MVP to test the market with a basic but functional product. At each stage, you're learning more and getting closer to a product that people can actually buy.

[Todd's belief formation and product-market fit]

framework in Todd.png. In sum, actors in the market collectively evaluate the entrepreneur's belief, which the entrepreneur then discovers by testing the belief in the market, a process that ultimately reveals the accuracy of assumptions about market needs, the feasibility of product features, and the fit of product features to market needs. We denote the market's view of the belief as the market model, or, in economic terms, the demand function for the belief, with V(M) = Needs(M) × Features(M) (right-hand part of Figure 1). The weighting of the needs and feature assumptions underlying V(M) typically differs from that of V(E) because the belief is self-generated by the entrepreneur, resulting in a misfit between V(E) and V(M). Hence, the true value of the opportunity belief (V) is proportional to the fit between V(E) and V(M), with fit being the correlation between V(E) and V(M), which we define as product-market fit. The entrepreneur may thus hypothesize a promising product, based on plausible assumptions about demand and a host of features deemed feasible to satisfy the demand, such that V(E) is high. But such an idea may still fail to create value when, as is often the case at the outset, the assumptions about demand and features inherent to the belief model fail to fit the revealed market model (low correlation between V(E) and V(M)). The value of an opportunity belief is thus V ∝ correlation(V(E), V(M)).

Viewing the value of an opportunity belief as scaled by its fit to the market is aligned with cognitive sciences and lens models of human judgment, which denote the fit of a judgment as the extent to which the mental representation matches the environment (Brunswik, 1956; Csaszar & Laureiro-Martínez, 2018; Kozyreva & Hertwig, 2021; Shepherd & Zacharakis, 2002). This view of belief fit is aligned with Todd and Gigerenzer's (2012) idea of ecological rationality, defined as the fit between a cognitive tool and the environment (Hertwig et al., 2019). Relative to this work, our notion of fit is more problem or belief specific—how well hypothesized solutions solve specific market problems. We thus view product–market fit as the correlation between (a) the entrepreneur's own projections about the value-creation capacity of assumed needs and associated product features, and (b) the revealed value-creation capacity of these needs and features in the market. We call the former the belief model (Figure 2, left) and the latter the market model of the belief (Figure 2, right).
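The fit-as-correlation definition above can be sketched numerically. Everything below is a hypothetical illustration (the weight vectors and noise level are made up, not from the paper): V(E) is the entrepreneur's valuation of the need/feature assumptions, V(M) is the market's revealed valuation, and product–market fit is their correlation.

```python
import numpy as np

rng = np.random.default_rng(1)

needs = rng.uniform(0, 1, size=50)     # value weights of assumed market needs
features = rng.uniform(0, 1, size=50)  # value weights of assumed product features

V_E = needs * features                                 # belief model
V_M = needs * features + rng.normal(0, 0.1, size=50)   # market weighs the same assumptions differently

fit = np.corrcoef(V_E, V_M)[0, 1]  # product-market fit = correlation(V(E), V(M))
print(round(fit, 2))
```

A high V(E) with low fit still yields low opportunity value, which is the paper's point.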

[need, solution, fulfillment triplet]

Details on the need, solution, fulfillment triplet, with a summary below:

Imagine data as the need, computational algorithms as the fulfillment, and the statistical model p(theta, y) as the need-solution pair. Mapping this to the SBC code (architecture) in the second row below teaches us: (a) functions (N, S, NSP, F, E) and objects (paired need-sol, fulfilled need-sol, evaluated fulfilled need-sol) are separable concepts (the need-solution pairing function produces the paired need-sol); (b) the need, the fulfillment, and the evaluating function can each be developed in parallel.

The precedence graph is as follows: the NS() function is the starting point. It has no incoming arrows, which implies it is the first function in the sequence.
From NS(), two paths diverge:
One path leads to the SNR() function.
The other path leads directly to the A() function.
The SNR() function then leads to the E() function.
The E() function, in turn, leads to the G() function.
NS() is thus a prerequisite for both SNR() and A(), meaning NS() must complete before moving to SNR() or A(). After SNR(), the functions proceed in linear order from E() to G(). There is no direct connection between A() and any other function, so A() can be treated as a separate, parallel process to the SNR()-E()-G() sequence.
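The ordering constraints above can be checked mechanically with a topological sort. A minimal sketch using Python's standard-library graphlib (the function names NS, SNR, A, E, G are the document's own; the edge set follows the arrows listed):

```python
from graphlib import TopologicalSorter

# Each key maps a function to the set of functions that must finish before it.
graph = {
    "SNR": {"NS"},  # NS must complete before SNR
    "A":   {"NS"},  # NS must complete before A; A is otherwise independent
    "E":   {"SNR"},
    "G":   {"E"},
}

order = list(TopologicalSorter(graph).static_order())
print(order)  # NS comes first; A may land anywhere after NS
```

Any valid ordering puts NS first and keeps SNR before E before G, while A floats freely after NS, matching the "parallel process" reading.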

Experiment on prior tail

Don't use lognormal for the deviation parameter.

neg_binomial seems to be less scale-dependent.

With lognormal and a different driving-data scale (initial order rate):
image

distribution

    ### 1) ode parameter prior
    model.set_prior("inventory_adjustment_time", "normal", 2, 0.4)  # heuristic of 1/5
    model.set_prior("minimum_order_processing_time", "normal", 0.05, 0.01)

    #### 2) sampling distribution parameter (measurement error) prior
    model.set_prior("phi", "inv_gamma", 2, 0.1) # mean = beta / (alpha - 1)

    ### 3)  measurement \tilde{y}_{1..t} ~ f(\theta, t)_{1..t}

    model.set_prior("work_in_process_inventory_obs", "neg_binomial_2", "work_in_process_inventory", "phi")
    model.set_prior("inventory_obs", "neg_binomial_2", "inventory", "phi")

This returns draws at the scale of mean 46, standard error 5.

normal(mu, mu/5) for parameters' priors (heuristic: standard deviation set to one-fifth of the mean)

lognormal for the measurement distribution (likelihood; similar to neg_binomial, except it handles continuous outcomes)

inverse-gamma(2, 0.1) for sigma's prior (anything but lognormal; zero- and extremity-avoiding)
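The three heuristics can be collected in a small helper. This is a hypothetical sketch (the function name and return structure are assumptions, not stanify API): every ODE parameter gets normal(mu, mu/5), and sigma gets the inverse-gamma hyperparameters.

```python
def heuristic_priors(mu_params, sigma_hyper=(2, 0.1)):
    """Map the prior heuristics above to {name: (dist, arg1, arg2)} tuples.

    mu_params: {parameter_name: prior_mean}; each gets normal(mu, mu/5).
    sigma gets inverse-gamma (zero- and extremity-avoiding).
    """
    priors = {name: ("normal", mu, mu / 5) for name, mu in mu_params.items()}
    priors["sigma"] = ("inv_gamma",) + tuple(sigma_hyper)
    return priors

print(heuristic_priors({"inventory_adjustment_time": 2}))
```

The normal(2, 0.4) prior used for inventory_adjustment_time earlier is exactly this rule applied to mu = 2.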

image

The measurement mean becomes 3.8e+27 with inverse-gamma(2, 5), whose mean is 5 (= beta / (alpha - 1)).
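The blow-up can be sketched directly: the mean of a lognormal observation is exp(mu + sigma^2/2), so heavy right-tail draws of sigma from inverse-gamma(2, 5) inflate it astronomically, while inverse-gamma(2, 0.1) keeps it tame. A numpy sketch under assumed values (mu = log(100), 10,000 draws; inverse-gamma sampled as the reciprocal of a gamma):

```python
import numpy as np

rng = np.random.default_rng(0)

def inv_gamma_draws(alpha, beta, size):
    # inverse-gamma(alpha, beta) = reciprocal of gamma(shape=alpha, rate=beta)
    return 1.0 / rng.gamma(alpha, 1.0 / beta, size)

mu = np.log(100)  # assumed log-scale location of the measurement
implied_stats = {}
with np.errstate(over="ignore"):  # extreme sigma draws overflow exp() to inf
    for alpha, beta in [(2, 0.1), (2, 5)]:
        sigma = inv_gamma_draws(alpha, beta, 10_000)
        implied = np.exp(mu + sigma**2 / 2)  # per-draw mean of lognormal(mu, sigma)
        implied_stats[(alpha, beta)] = (np.median(implied), implied.max())

for prior, (med, mx) in implied_stats.items():
    print(prior, med, mx)
```

With (2, 0.1) the implied mean stays near 100; with (2, 5) the tail draws push it past any reasonable scale, which is the behavior reported above.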

Should we keep two separate Stan function files for models with different numeric assumptions?

Do two models with different values of an assumed parameter use the same Stan function file? If not, we have to define two models in the Python file and reinitialize each with the same list of prior distributions, which is cumbersome.

The example situation above is an asymmetric generator and estimator: process_noise_scale = 0 is added in the latter, which leads to a minute difference between the two model and function files.

Compare vensim and stanify

S = 30, M = 1000 (vensim and stan would be different but non-autocorrelated, converged sample), N = 20
image

without process noise, with measurement noise

  1. Compare the average: prior_pred_obs: https://github.com/Data4DM/stanify/blob/main/stanify_demo.py#L47

Use the alpha, beta, gamma, delta prior distributions defined in the draws2data Stan file.

We assume Stan's RNG and Vensim's RNG behave similarly. In Stan, prey_obs ~ normal_rng(prey, m_noise_scale)

array([ 0.24878059,  0.29612573,  0.55957889,  0.06677449, -2.04914881,
       -0.08381008, -0.82436614,  0.5243477 , -0.60570814, -0.37597002,
       -0.92898971,  0.0248358 ,  0.60356123, -0.75828866,  0.12360369,
        0.18310545, -0.37519184, -0.77519677,  0.07079331,  0.17829731])

Compare prey_obs, predator_obs
image

estimates without process noise

image

with process noise, with measurement noise

The realized random variables should be consistent between the two tools; a lookup function (CDF) is the right way to go.
Make sure np.random.normal(0, 1, size=n_t) is used.

array([ 0.24878059,  0.29612573,  0.55957889,  0.06677449, -2.04914881,
       -0.08381008, -0.82436614,  0.5243477 , -0.60570814, -0.37597002,
       -0.92898971,  0.0248358 ,  0.60356123, -0.75828866,  0.12360369,
        0.18310545, -0.37519184, -0.77519677,  0.07079331,  0.17829731])

image
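The matched-noise idea above (one shared draw of np.random.normal(0, 1, size=n_t)) can be sketched as follows. The seed value is an assumption; the point is that the standardized series is drawn once and fed to both Vensim and Stan as data, rather than letting each tool draw internally:

```python
import numpy as np

n_t = 20
rng = np.random.RandomState(0)          # legacy RNG, matching np.random.normal usage
raw_noise = rng.normal(0, 1, size=n_t)  # standardized draws; scale is applied inside the model

# raw_noise would now be passed to both simulators (e.g. via a Vensim lookup
# and a Stan data block) so the realized process noise is identical.
print(raw_noise[:3])
```

With the seed fixed, re-running this produces the same series, which is what makes the draws2data comparison meaningful.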

Exclude Vensim binaries and backups from repo

The following Vensim file types should be on the global exclude list:

  • .2mdl
  • .3vmfx
  • probably a few other types I haven't thought of yet

Most .vdfx files should also be ignored; instead, check in a .tab or .dat version of the data. Not sure .vdfx should be a global exclude item though - sometimes it's a convenience.

Inventory management case study

The end goals of this model are three:

  • online decision
  • dynamic timestep aggregation
  • dynamic subscript aggregation

The first case study is the fusion of Bayequentist from DailyDigest and ManageChain from MonthlyModel. Three Bayesian checks are applied to a chained system where demand/supply and material/information are matched. This is the diagram:

image

I aim to address first two topics from #5,

Originally posted by hyunjimoon September 12, 2022

Goal

to figure out the best way to tune the following hyper-parameters and useful resources for each. I left a link for each discussion (#12, #7, #9, #11) as they need active development, so please leave a comment in each link! If you think of any other tunable parameters, please leave a comment so that I can allocate a new discussion thread for it.

1. Typify model into PA or PAD based on the research purpose in #12

2. Typify parameters as assumed, assumed time-series, or estimated for testing in #7

  • this discussion on holding constant (clamping) between two leaders (Andrew Gelman and Judea Pearl)

Refactoring stanify: Mismatching initialization stock value

Stanify(.mdl) writes the following as production_start_rate in the ode_vensim_function of the function block.

max(0, 1 + 2 + 2 * 1 - 2 + 2 * 1 / 7) 
+ 6 * max(0, 1 + 2 + 2 * 1 - 2 + 2 * 1 / 7) 
- 6 * max(0, 1 + 2 + 2 * 1 - 2 + 2 * 1 / 7) / 3

When wip_adjustment_time = 3, manufacturing_cycle_time = 6, inventory_adjustment_time = 8, time_to_average_order_rate = 8:

max(0, 1 + 2 + 2 * 1 - 2 + 2 * 1 / 8) 
+ 6 * max(0, 1 + 2 + 2 * 1 - 2 + 2 * 1 / 8) 
- 6 * max(0, 1 + 2 + 2 * 1 - 2 + 2 * 1 / 8) / 3;

When wip_adjustment_time = 2, manufacturing_cycle_time = 8, inventory_adjustment_time = 8, time_to_average_order_rate = 8:

max(0,
+ 8 * max(0, 1 + 2 + 2 * 1 - 2 + 2 * 1 / 8) 
- 8 * max(0, 1 + 2 + 2 * 1 - 2 + 2 * 1 / 8) 
      + max(0, 1 + 2 + 2 * 1 - 2 + 2 * 1  / 8) / 2)

Whereas manually computing, by replacing the following equation with the values assigned in Vensim (below), returns:

max(0, 
        (+ 8 * max(0, 1 + (+ (2 + 2) * 2 
                           - (2 + 2) * 2
                           )/8) 
         - 8 * max(0, 1 + (+ (2 + 2) * 2 
                           - (2 + 2) * 2
                           )/8) 
        )/8 
+ 
max(0, 
              1 + (+ 1 * (2 + 2) 
                   - 1 * (2 + 2)
                )/8
            )
)

production_start_rate_stock_init is computed using topologically-sorted abstract syntax tree:
INIT: production_start_rate_stock_init = production_start_rate
EQN:

  • desired_production_start_rate = adj_for_wip + desired_production
  • adj_for_wip= (desired_wip - work_in_process_inventory)/wip_adjustment_time
= max(0, desired_production_start_rate) 
= max(0, 
               adj_for_wip 
               + desired_production
      ) 

= max(0, 
             (desired_wip - work_in_process_inventory)
             /wip_adjustment_time
            + max(0, customer_order_rate + production_adjustment_from_inventory) 
      )

INIT: work_in_process_inventory_init = desired_wip
EQN: production_adjustment_from_inventory = (desired_inventory - inventory)/inventory_adjustment_time

    = max(0, 
                 (desired_wip - desired_wip)
                 /wip_adjustment_time
                 + max(0, customer_order_rate + `(desired_inventory - inventory)/inventory_adjustment_time`)
      )

INIT: inventory_init = desired_inventory
EQN:
desired_wip = manufacturing_cycle_time * desired_production
desired_production = MAX(0, expected_order_rate + production_adjustment_from_inventory)

    = max(0, 
        (+ manufacturing_cycle_time * desired_production 
         - manufacturing_cycle_time * desired_production
        )/wip_adjustment_time
        + max(0, 
            customer_order_rate + (desired_inventory - desired_inventory
                   )/inventory_adjustment_time
           )
       )

EQN:
desired_inventory = MAX(0, desired_inventory_coverage + expected_order_rate)

    = max(0, 
        (+ manufacturing_cycle_time * desired_production 
         - manufacturing_cycle_time * desired_production
        )/wip_adjustment_time
        + max(0, 
            customer_order_rate + 
           (max(0, desired_inventory_coverage + expected_order_rate) 
           -  max(0, desired_inventory_coverage + expected_order_rate)
                   )/inventory_adjustment_time
           )
       )

INIT
expected_order_rate_init = customer_order_rate

EQN:
desired_production = MAX(0, expected_order_rate + production_adjustment_from_inventory)
desired_wip_init = manufacturing_cycle_time * desired_production
desired_inventory_init = expected_order_rate * desired_inventory_coverage

    = max(0, 
        (+ manufacturing_cycle_time * max(0, expected_order_rate + production_adjustment_from_inventory) 
          - manufacturing_cycle_time * max(0, expected_order_rate + production_adjustment_from_inventory)
        )/wip_adjustment_time
        + max(0, 
            customer_order_rate + 
           (max(0, `desired_inventory_coverage`+ customer_order_rate) 
           -  max(0, `desired_inventory_coverage` + customer_order_rate)
                   )/inventory_adjustment_time
           )
      )

INIT again from the newly created terms (customer_order_rate is always the second term, lagging behind the first):
expected_order_rate_init = customer_order_rate
EQN:

  • desired_inventory_coverage = minimum_order_processing_time + safety_stock_coverage
    = max(0, 
        (+ manufacturing_cycle_time * max(0, expected_order_rate + (desired_inventory - inventory)/inventory_adjustment_time) 
          - manufacturing_cycle_time * max(0, expected_order_rate + (desired_inventory - inventory)/inventory_adjustment_time)
        )/wip_adjustment_time
        + max(0, 
              customer_order_rate + 
               (max(0, expected_order_rate * (minimum_order_processing_time + safety_stock_coverage)) 
                - max(0, expected_order_rate * (minimum_order_processing_time + safety_stock_coverage))
               )/inventory_adjustment_time
            )
      )

INIT: expected_order_rate_init = customer_order_rate
EQUATION: customer_order_rate = desired_inventory_coverage * expected_order_rate

    = max(0, 
        (+ manufacturing_cycle_time * max(0, customer_order_rate + (desired_inventory - desired_inventory)/inventory_adjustment_time) 
         - manufacturing_cycle_time * max(0, customer_order_rate + (desired_inventory - desired_inventory)/inventory_adjustment_time)
        )/wip_adjustment_time
        + max(0, 
              customer_order_rate + 
               (max(0, customer_order_rate * (minimum_order_processing_time + safety_stock_coverage)) 
                - max(0, customer_order_rate * (minimum_order_processing_time + safety_stock_coverage))
               )/inventory_adjustment_time
            )
      )

The following is numeric tracking using the values

  • customer_order_rate = 1
  • wip_adjustment_time = 3
  • minimum_order_processing_time = 5
  • manufacturing_cycle_time = 6
  • inventory_adjustment_time = 7
  • time_to_average_order_rate = 8
  • safety_stock_coverage = 2

INSERT INIT production_start_rate_stock_init = production_start_rate

= max(0, desired_production_start_rate) 
= max(0, 
               adj_for_wip 
               + desired_production
      ) 

= max(0, 
             (desired_wip - work_in_process_inventory)
             /wip_adjustment_time
            + max(0, customer_order_rate + production_adjustment_from_inventory) 
      )
INSERT INIT work_in_process_inventory_init = desired_wip

    = max(0, 
                 (desired_wip - desired_wip)
                 /3
                 + max(0, 1 + (desired_inventory - inventory)/inventory_adjustment_time)
      )
INSERT INIT inventory_init = desired_inventory

    = max(0, 
        (+ manufacturing_cycle_time * desired_production 
         - manufacturing_cycle_time * desired_production
        )/3
        + max(0, 
            1 + (desired_inventory - desired_inventory
                   )/7
           )
       )
INSERT INIT desired_wip_init = manufacturing_cycle_time * desired_production, desired_inventory_init = expected_order_rate * desired_inventory_coverage

    = max(0, 
        (+ 6 * max(0, expected_order_rate + production_adjustment_from_inventory) 
          - 6 * max(0, expected_order_rate + production_adjustment_from_inventory)
        )/3
        + max(0, 
              1 + (+ expected_order_rate * desired_inventory_coverage 
                      - expected_order_rate * desired_inventory_coverage
                  )/7
            )
      )
INSERT expected_order_rate_init = customer_order_rate = 1

    = max(0, 
        (+ 6 * max(0, 1 + (desired_inventory - inventory)/inventory_adjustment_time) 
         - 6 * max(0, 1 + (desired_inventory - inventory)/inventory_adjustment_time)
        )/3
        + max(0, 
              1 + (+ 1 * (minimum_order_processing_time + safety_stock_coverage) 
                   - 1 * (minimum_order_processing_time + safety_stock_coverage)
                )/7
            )
      )
INSERT INIT

    = max(0, 
        (+ 6 * max(0, 1 + (desired_inventory - desired_inventory)/7) 
         - 6 * max(0, 1 + (desired_inventory - desired_inventory)/7)
        )/3
        + max(0, 
              1 + (+ 1 * (2 + 2) 
                   - 1 * (2 + 2)
                )/7
            )
      )

    = max(0, 
        (+ 6 * max(0, 1 + (+ desired_inventory_coverage * expected_order_rate
                           - desired_inventory_coverage * expected_order_rate
                           )/7) 
         - 6 * max(0, 1 + (+ desired_inventory_coverage * expected_order_rate
                           - desired_inventory_coverage * expected_order_rate
                           )/7) 
        )/3
        + max(0, 
              1 + (+ 1 * (2 + 2) 
                   - 1 * (2 + 2)
                )/7
            )
      )

    = max(0, 
        (+ 6 * max(0, 1 + (+ (minimum_order_processing_time + safety_stock_coverage) * 2 
                           - (minimum_order_processing_time + safety_stock_coverage) * 2
                           )/7) 
         - 6 * max(0, 1 + (+ (minimum_order_processing_time + safety_stock_coverage) * 2 
                           - (minimum_order_processing_time + safety_stock_coverage) * 2
                           )/7) 
        )/3 
        + max(0, 
              1 + (+ 1 * (2 + 2) 
                   - 1 * (2 + 2)
                )/7
            )
      )

      = max(0, 
        (+ 6 * max(0, 1 + (+ (2 + 2) * 2 
                           - (2 + 2) * 2
                           )/7) 
         - 6 * max(0, 1 + (+ (2 + 2) * 2 
                           - (2 + 2) * 2
                           )/7) 
        )/3 
        + max(0, 
              1 + (+ 1 * (2 + 2) 
                   - 1 * (2 + 2)
                )/7
            )
        )
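As a cross-check on the derivation above, the INIT chain can be evaluated numerically. A sketch using the parameter values listed (function name and structure are mine, not stanify output): when every stock is initialized at its desired value, the adjustment terms cancel and production_start_rate_init collapses to customer_order_rate.

```python
def production_start_rate_init(customer_order_rate=1,
                               wip_adjustment_time=3,
                               minimum_order_processing_time=5,
                               manufacturing_cycle_time=6,
                               inventory_adjustment_time=7,
                               safety_stock_coverage=2):
    expected_order_rate = customer_order_rate                        # INIT
    desired_inventory_coverage = (minimum_order_processing_time
                                  + safety_stock_coverage)
    desired_inventory = expected_order_rate * desired_inventory_coverage  # INIT
    inventory = desired_inventory                                    # INIT
    adj_inventory = (desired_inventory - inventory) / inventory_adjustment_time  # = 0
    desired_production = max(0, expected_order_rate + adj_inventory)
    desired_wip = manufacturing_cycle_time * desired_production      # INIT
    work_in_process_inventory = desired_wip                          # INIT
    adj_wip = (desired_wip - work_in_process_inventory) / wip_adjustment_time  # = 0
    return max(0, adj_wip + desired_production)

print(production_start_rate_init())  # equilibrium start: equals customer_order_rate
```

Any mismatch between this value and what the generated Stan code computes points to a substitution bug in the AST walk, which is the issue being reported here.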

Verifying chaos in population dynamics

Based on the rugged posterior space reported by Angie, Tom and Angie will delve into this weird behavior:
After changing the measurement noise from additive to multiplicative, a different pattern is observed, but with initial stock values (prey, predator) = (30, 4) and the following prior distributions, the geometry is still problematic. After the image of the multiplicative and additive measurement-noise formulations are the two errors I faced in the multiplicative setting (which seems to have a terrible posterior space).
image

model.set_prior("alpha", "normal", 0.55, 0.055, lower = 0)
model.set_prior("beta", "normal", 0.028, 0.0028, lower = 0)
model.set_prior("delta", "normal", 0.024, 0.0024, lower = 0)
model.set_prior("gamma", "normal", 0.8, 0.08, lower = 0)
model.set_prior("m_noise_scale", "normal", 0.1, 0.001, lower = 0)

For instance, I faced two warnings from the HMC sampler which indicate it is suffering from problematic posterior geometry. The first is divergent transitions, which I encountered for the first time in SD models; it means the Hamiltonian is not conserved along the simulated trajectory of the vector field. The second I am still debugging.

	Chain 1 had 3 divergent transitions (3.0%)
	Chain 2 had 1 divergent transitions (1.0%)
	Chain 4 had 1 divergent transitions (1.0%)
chain 1 |██████████| 00:54 Sampling completed                       
chain 2 |██████████| 00:54 Sampling completed                       
chain 3 |██████████| 00:54 Sampling completed                       
chain 4 |██████████| 00:54 Sampling completed                       
19:30:43 - cmdstanpy - INFO - CmdStan done processing.
19:30:43 - cmdstanpy - WARNING - Non-fatal error during sampling:
Exception: ode_rk45: ode parameters and data is inf, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/prey_predator_nc/prey_predator_nc_data2draws.stan', line 38, column 4 to column 162)
	Exception: ode_rk45: ode parameters and data is inf, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/prey_predator_nc/prey_predator_nc_data2draws.stan', line 38, column 4 to column 162)
	Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/prey_predator_nc/prey_predator_nc_data2draws.stan', line 50, column 4 to column 43)
	Exception: ode_rk45:  Failed to integrate to next output time (0.01) in less than max_num_steps steps (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/prey_predator_nc/prey_predator_nc_data2draws.stan', line 38, column 4 to column 162)
	Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/prey_predator_nc/prey_predator_nc_data2draws.stan', line 50, column 4 to column 43)
	Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/prey_predator_nc/prey_predator_nc_data2draws.stan', line 50, column 4 to column 43)
	Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/prey_predator_nc/prey_predator_nc_data2draws.stan', line 50, column 4 to column 43)
	Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/prey_predator_nc/prey_predator_nc_data2draws.stan', line 50, column 4 to column 43)
	Exception: ode_rk45: ode parameters and data is inf, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/prey_predator_nc/prey_predator_nc_data2draws.stan', line 38, column 4 to column 162)
	Exception: ode_rk45: ode parameters and data is inf, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/prey_predator_nc/prey_predator_nc_data2draws.stan', line 38, column 4 to column 162)
	Exception: ode_rk45:  Failed to integrate to next output time (0.01) in less than max_num_steps steps (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/prey_predator_nc/prey_predator_nc_data2draws.stan', line 38, column 4 to column 162)
Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/prey_predator_nc/prey_predator_nc_data2draws.stan', line 50, column 4 to column 43)
	Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/prey_predator_nc/prey_predator_nc_data2draws.stan', line 50, column 4 to column 43)
	Exception: ode_rk45:  Failed to integrate to next output time (0.01) in less than max_num_steps steps (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/prey_predator_nc/prey_predator_nc_data2draws.stan', line 38, column 4 to column 162)
	Exception: ode_rk45:  Failed to integrate to next output time (0.01) in less than max_num_steps steps (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/prey_predator_nc/prey_predator_nc_data2draws.stan', line 38, column 4 to column 162)
	Exception: normal_lpdf: Location parameter[149] is nan, but must be finite! (in '/Users/hyunjimoon/Dropbox/15879-Fall2022/Homeworks/HW7/stanify/stan_files/prey_predator_nc/prey_predator_nc_data2draws.stan', line 50, column 4 to column 43)
	(the same ode_rk45 and normal_lpdf exceptions repeated for the remaining chains)
Consider re-running with show_console=True if the above output is unclear!

Use Xarray for testing

Continuing the discussion in #46

I am writing a summary of everything that could have been done better. This project used xarray with .nc files (a great format for spatio-temporal data); you can reproduce the plots below on your own computer in three minutes by downloading the .nc file here. Links to the Python files that produce the plots would be greatly useful for replication.

Different `estimated parameter` values are used for stock initial values and ODE integration

@tomfid and @Dashadower I need your help!
From the Stan draws2data code below, inventory_adjustment_time = 7 and minimum_order_processing_time = 5 are used to compute the initial values of the stocks. These are inputs read from Vensim.

    // Initial ODE values
    real inventory__init = 5 + 2 * 100;
    real expected_order_rate__init = 100;
    real work_in_process_inventory__init = 6 * fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7);
    real production_rate_stocked__init = 6 * fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7) / 6;
    real production_start_rate_stocked__init = fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7) + 6 * fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7) - 6 * fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7) / 3;
    real backlog__init = 100 * 2;

However, for the ODE integration, values generated from the user-defined prior distributions normal_rng(2, 0.1) and normal_rng(0.05, 0.001) are used, which are much smaller than 7 and 5.

    real inventory_adjustment_time = normal_rng(2, 0.1);
    real minimum_order_processing_time = normal_rng(0.05, 0.001);
    real m_noise_scale = normal_rng(0.01, 0.0005);

I think this is a serious problem, as I've seen different dynamics pan out from minute changes of initial values (tipping points? #14). So the question is: how do we sync the two versions of the modeler, one initializing parameter values from Vensim, the other initializing parameter distributions from stanify? Below is the entire draws2data code.

```stan
generated quantities{
    real inventory_adjustment_time = normal_rng(2, 0.1);
    real minimum_order_processing_time = normal_rng(0.05, 0.001);
    real m_noise_scale = normal_rng(0.01, 0.0005);

    // Initial ODE values
    real inventory__init = 5 + 2 * 100;
    real expected_order_rate__init = 100;
    real work_in_process_inventory__init = 6 * fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7);
    real production_rate_stocked__init = 6 * fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7) / 6;
    real production_start_rate_stocked__init = fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7) + 6 * fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7) - 6 * fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7) / 3;
    real backlog__init = 100 * 2;

    vector[6] initial_outcome;  // Initial ODE state vector
    initial_outcome[1] = inventory__init;
    initial_outcome[2] = expected_order_rate__init;
    initial_outcome[3] = work_in_process_inventory__init;
    initial_outcome[4] = production_rate_stocked__init;
    initial_outcome[5] = production_start_rate_stocked__init;
    initial_outcome[6] = backlog__init;

    vector[6] integrated_result[n_t] = ode_rk45(vensim_ode_func, initial_outcome, initial_time, times, minimum_order_processing_time, inventory_adjustment_time);
    array[n_t] real inventory = integrated_result[:, 1];
    array[n_t] real expected_order_rate = integrated_result[:, 2];
    array[n_t] real work_in_process_inventory = integrated_result[:, 3];
    array[n_t] real production_rate_stocked = integrated_result[:, 4];
    array[n_t] real production_start_rate_stocked = integrated_result[:, 5];
    array[n_t] real backlog = integrated_result[:, 6];

    vector[20] production_rate_stocked_obs = to_vector(normal_rng(production_rate_stocked, m_noise_scale));
    vector[20] production_start_rate_stocked_obs = to_vector(normal_rng(production_start_rate_stocked, m_noise_scale));
}
```
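For comparison with the Vensim-side initialization, the prior draws in this generated quantities block can be mirrored on the Python side. A minimal numpy sketch (the parameter names and prior values are taken from the block above; the `S = 30` draw count and the seed are assumptions):

```python
import numpy as np

# Draw S prior samples, mirroring the normal_rng calls in the
# generated quantities block above.
rng = np.random.default_rng(seed=0)  # assumed seed, for reproducibility
S = 30  # assumed number of prior draws

prior_draws = {
    "inventory_adjustment_time": rng.normal(2, 0.1, size=S),
    "minimum_order_processing_time": rng.normal(0.05, 0.001, size=S),
    "m_noise_scale": rng.normal(0.01, 0.0005, size=S),
}
```

Feeding these draws back into Vensim (or into pysd) as point initializations would let us check whether the two modelers start from the same distributional assumptions.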

Check whether the stanify implementation downsamples

`real production_start_rate_stocked_change_rate = production_start_rate - production_start_rate_stocked / time_step;` from the function block of the inventory model with process noise: is this evaluated 20 times (once per observation) or 20 * (1/0.0625) times (once per integration step)?
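The two candidate evaluation counts differ by a factor of 1/TIME STEP. A quick check of the arithmetic (assuming 20 observations at unit time intervals and TIME STEP = 0.0625):

```python
# Observation count vs. fixed-step integration steps for the inventory model.
time_step = 0.0625  # Vensim TIME STEP
n_obs = 20          # number of observed time points, assumed at unit intervals

euler_steps_per_obs = round(1 / time_step)        # 16 integration steps per unit time
total_euler_steps = n_obs * euler_steps_per_obs   # 320
print(total_euler_steps)  # 320
```

So the question is whether the expression runs 20 or 320 times per simulated trajectory (ode_rk45 is adaptive, so the true count may differ again from the fixed-step figure).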

Structure to combine posterior samples from different prior draws and compute loglikelihoods

We use S = 30, M = 100, N = 20.

Procedure

1. Generate 30 datasets

Use $\tilde{\alpha} =.8, \tilde{\beta}= .05, \tilde{\gamma} = .8, \tilde{\delta} = .05$ and inject process noise. Four of the thirty generated $\tilde{Y}_s$ datasets (each of size 2 * 20) look like:
image

2. Run MCMC for each dataset $\tilde{Y_s}$, which returns one hundred posterior draws $\alpha_{1..100}, \beta_{1..100}, \gamma_{1..100}, \delta_{1..100}$. The hundred posterior vectors for $s = 1$ look like:

image

3. Calculate loglikelihood for given $Y_s$

using each posterior sample $m = 1..M$. For instance, in $sm$ subscript notation, $\alpha_{11} =.7, \beta_{11} = .06, \gamma_{11} = .8, \delta_{11} = .06$ is the sample with $s = 1, m = 1$. In total the loglikelihood, a function of the four parameter values and $Y_s$, is computed $S \times M = 3{,}000$ times.

4. Compute rank of loglikelihood within each S

The rank for each dataset is $r_s = \Sigma_{m = 1..M}\, \mathbb{1}[f(\alpha_m, \beta_m, \gamma_m, \delta_m, Y_s) < f(\tilde{\alpha}, \tilde{\beta}, \tilde{\gamma}, \tilde{\delta}, Y_s)]$. Plot the histogram of these $S$ ranks (x-axis range 0 to $M$ = 100).
image
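The four steps above can be sketched end-to-end in numpy. The quadratic log-likelihood and the synthetic posterior draws below are purely illustrative stand-ins for the real model; only the loop structure and the rank formula follow the procedure:

```python
import numpy as np

# Rank computation of step 4, with toy stand-ins for f(., Y_s) and the
# posterior draws.  S and M follow the text (S = 30, M = 100).
rng = np.random.default_rng(1)
S, M = 30, 100
true_theta = np.array([0.8, 0.05, 0.8, 0.05])  # (alpha~, beta~, gamma~, delta~)

def loglik(theta, y):
    # Placeholder for the real model loglikelihood f(alpha, beta, gamma, delta, Y_s)
    return -float(np.sum((y - theta) ** 2))

ranks = []
for s in range(S):
    y_s = true_theta + rng.normal(0, 0.05, size=4)        # stand-in for Y_s
    post = true_theta + rng.normal(0, 0.05, size=(M, 4))  # stand-in posterior draws
    ll_post = np.array([loglik(theta, y_s) for theta in post])
    ll_true = loglik(true_theta, y_s)
    ranks.append(int(np.sum(ll_post < ll_true)))  # rank of the true draw, 0..M
```

If calibration holds, the histogram of `ranks` should be approximately uniform on 0..M.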

@tseyanglim, @tomfid I hope the above is more descriptive (which TY requested) :)

comparing old and new movie data

issues

slightly different names

  • spelling variants (y vs. i)
    image

  • abbreviation
    "Dominique A." vs. "Dominique Abel"
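One possible normalization for the abbreviation case: treat "Dominique A." as matching "Dominique Abel" when every token is either equal or an initial-with-period prefix. This is a heuristic sketch, not part of the current pipeline:

```python
def names_match(abbrev: str, full: str) -> bool:
    """True if each token of `abbrev` equals, or is a dotted prefix of
    ('A.' vs 'Abel'), the corresponding token of `full`."""
    a_toks, f_toks = abbrev.split(), full.split()
    if len(a_toks) != len(f_toks):
        return False
    for a, f in zip(a_toks, f_toks):
        if a == f:
            continue
        if len(a) > 1 and a.endswith(".") and f.startswith(a[:-1]):
            continue
        return False
    return True

print(names_match("Dominique A.", "Dominique Abel"))  # True
```

A real matcher would also need to handle the y/i spelling variants, e.g. via a fuzzy string distance.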

new is smaller than imdb online

  • full credit info that appears online (Viva, 2001, TV series, 1 episode) is not included in new.tsv
    image

  • a documentary's category is not actor or actress but self, e.g. Scrooge from Courier Culture (order = 1, but far from a star)
    image

-- should I include all categories? array(['self', 'director', 'cinematographer', 'composer', 'producer', 'editor', 'actor', 'actress', 'writer', 'production_designer', 'archive_footage', 'archive_sound'])

```shell
tt2236646       video   Courier Culture Courier Culture 0       2012    \N      9       Biography,Documentary,News
bash-3.2$ grep -w "Courier Culture" movie_principals.tsv
tt7813156       10      nm9522476       self    \N      ["Self - The Courier Culture Editor"]
tt7813156       9       nm9522475       self    \N      ["Self - The Courier Culture Editor"]
```

statistics

old: 16m (15,870,224)
new: 20m (20,517,830)
oldnew_left_merge: 16m (15,876,865; can exceed the left table's row count if the right table has duplicate [title_year, primaryName] rows: when two right-table records match one left-table record, the merge returns two records)
oldnew_inner_merge: 2.5m
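The left-merge row inflation noted above can be reproduced in miniature with pandas (the rows below are hypothetical toy data, not from the real tables):

```python
import pandas as pd

# One left row, two right rows sharing the same [title_year, primaryName] key.
left = pd.DataFrame({"title_year": ["Viva_2001"],
                     "primaryName": ["Dominique Abel"],
                     "x": [1]})
right = pd.DataFrame({"title_year": ["Viva_2001", "Viva_2001"],
                      "primaryName": ["Dominique Abel", "Dominique Abel"],
                      "category": ["actor", "self"]})

merged = left.merge(right, on=["title_year", "primaryName"], how="left")
print(len(merged))  # 2: the single left row matched two right rows
```

This is why `oldnew_left_merge` has more rows (15,876,865) than `old` itself (15,870,224).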

Q. for new, isn't a 1:3 ratio of titles to title-person rows too small? (can ~10 cast members per title be few enough, and can 1 cast member, e.g. for a documentary, be many enough, to explain this?)
