Code for Large-Scale Study of Temporal Shift in Health Insurance Claims. Christina X Ji, Ahmed M Alaa, David Sontag. CHIL, 2023. https://arxiv.org/abs/2305.05087
replace_zeros_with_null_in_measurement.py does not accept the --version argument that the guide mentions:
To identify measurements with value 0 that should be null and replace them with null, first get a preliminary version of the reference ranges by running python3 replace_zeros_with_null_in_measurement.py --create_table=range_{direction} --version=[int] to produce cdm_measurement_aux.measurement_{direction}_references, a table with the most likely reference range for each measurement concept. Set {direction} to low and high.
Then I added user, password and host to the config file.
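For reference, a minimal sketch of how those three settings could be read and turned into a connection URL. It assumes an INI-style config file with a hypothetical `[database]` section; the section name, the `build_db_url` helper and the `postgresql` default scheme are illustrative assumptions, not the repository's actual layout.

```python
import configparser
from urllib.parse import quote


def build_db_url(config_text, section="database", scheme="postgresql"):
    """Build a connection URL from user, password and host settings.

    The section name and the default scheme are assumptions for illustration.
    """
    # interpolation=None keeps a literal '%' in a password from being
    # treated as configparser interpolation syntax.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    db = parser[section]

    # Percent-encode credentials so characters such as '@' or ':' cannot
    # be mistaken for URL delimiters.
    user = quote(db["user"], safe="")
    password = quote(db["password"], safe="")
    return f"{scheme}://{user}:{password}@{db['host']}"


EXAMPLE_CONFIG = """
[database]
user = alice
password = p@ss word
host = db.example.org
"""

print(build_db_url(EXAMPLE_CONFIG))
```

Keeping credentials in the config file rather than in the scripts means each user only has to edit one place.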
It might also be worth refactoring the engine creation and session execution into a separate function, since the same code is repeated in multiple functions (at least in those I have seen so far).
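A rough sketch of what such a shared helper could look like. It uses `sqlite3` from the standard library as a stand-in for the real engine, and the names `run_in_session`, `execute_sql` and `connect_in_memory` are hypothetical; the repository's own `session_scope` is not reproduced here.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def run_in_session(connect):
    """Open a connection, commit on success, roll back on error, always close."""
    conn = connect()
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()


def execute_sql(connect, sql, params=()):
    """Run one statement through the shared helper and return any rows."""
    with run_in_session(connect) as conn:
        return conn.execute(sql, params).fetchall()


def connect_in_memory():
    # Stand-in for the real engine creation (assumption for illustration).
    return sqlite3.connect(":memory:")


rows = execute_sql(connect_in_memory, "SELECT 1 + 1")
print(rows)
```

Each calling function would then reduce to a single `execute_sql(...)` call instead of repeating the setup and teardown.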
The import is written as from utils import session_scope, but I believe it should be from utils.dbutils import session_scope.
Likewise, I needed to make the same change to the import in load_lab_reference.py, although I see that you add the directory to the path there. From which directory are you supposed to run the scripts in data_extraction?
The conda_env_pkgs.txt file pins specific binary builds of packages that only work on Linux. Unfortunately, I need to use Windows to access a database. Is it possible to include a file with no build-specific information that lists only the direct dependencies of the project? For example, the following command should work:
I think in many cases users have only one writable schema, tied to their user account. To get this to work in my case, I removed the cdm_measurement_aux schema and replaced it with {scratch_schema}. I then added scratch_schema to the config file:
scratch_schema = `my_user_name`
Then, when formatting the SQL file, I pass in this variable. I think this better fits situations where users can't create their own schemas but have a fixed one they can write to.
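A minimal sketch of that formatting step, assuming the SQL files are filled in with Python's `str.format`. The `format_sql` helper and the example query are hypothetical; only the `{scratch_schema}` and `{direction}` placeholders and the `measurement_{direction}_references` table name come from the discussion above.

```python
import re

# Plain identifiers only, so the schema name cannot smuggle in extra SQL.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def format_sql(template, scratch_schema, **params):
    """Fill a SQL template with the user's writable schema and other values."""
    if not _IDENTIFIER.fullmatch(scratch_schema):
        raise ValueError(f"invalid schema name: {scratch_schema!r}")
    return template.format(scratch_schema=scratch_schema, **params)


# Hypothetical template; the real SQL files in the repository will differ.
TEMPLATE = "SELECT * FROM {scratch_schema}.measurement_{direction}_references"

print(format_sql(TEMPLATE, "my_user_name", direction="low"))
```

Because the schema name is pasted directly into the SQL text, validating it as a plain identifier guards against a malformed config value producing an unintended statement.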