tuva's Introduction

🧰 What is the Tuva Project?

The Tuva Project code base includes a core data model, data marts, terminology sets, and data quality tests for doing healthcare analytics.

Explore the project:

Note: In many cases the actual terminology sets are too large to maintain on GitHub, so we host them in a public AWS S3 bucket. Executing dbt build will load the terminology sets from S3.
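
The website also provides warehouse-specific SQL scripts for loading Terminology directly from S3. A minimal, Snowflake-flavored sketch of that kind of load follows; the bucket path, table name, and file format details here are placeholders, not the real ones:

```sql
-- Hypothetical sketch: bulk-load one terminology file from the public
-- S3 bucket into a warehouse table. The bucket path and table name are
-- placeholders; use the SQL scripts on the Tuva docs site for real paths.
copy into terminology.icd_10_cm
from 's3://<tuva-public-bucket>/terminology/icd_10_cm.csv'
file_format = (type = 'CSV' field_optionally_enclosed_by = '"' skip_header = 1);
```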

Check out our Quickstart guide here.

🔌  Supported Data Warehouses and dbt Versions

  • BigQuery
  • Databricks (community supported)
  • DuckDB (community supported)
  • Redshift
  • Snowflake

This package supports dbt version 1.3.x or higher.

🙋🏻‍♀️ Contributing

We created the Tuva Project to be a place where healthcare data practitioners can share their knowledge about doing healthcare analytics. If you have ideas for improvements or find bugs, we highly encourage and welcome feedback! Feel free to create an issue or ping us on Slack.

Check out our Contribution guide here.

🤝 Community

Join our growing community of healthcare data people in Slack!

tuva's People

Contributors

aneiderhiser, cnolanminich, cocozuloaga, deepson-maitri, deepsonshrestha, donaldrauscher, eldon-tuva, krishfhc, msolnit, nrichards17, sarah-tuva, thutuva, tom-tuva, tuvaforrest, utsavpaudel


tuva's Issues

Build new model CMS-HCC-V28 (payment year 2024)

  • Update seed and mapping files with new values
  • Update logic in the int_hcc_mapping model to use the new V28 flag column (needs a macro; a sketch follows this list)
  • Add logic to create the blended risk score

Note: this list may not be complete yet.
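
A macro along these lines could select the right flag column by model version. This is a hypothetical sketch only; the variable and column names are illustrative, not the actual ones in int_hcc_mapping:

```sql
-- Hypothetical sketch of the version-switching macro. The var name
-- (cms_hcc_model_version) and the flag column names are illustrative only.
{% macro hcc_flag_column(model_version) %}
    {%- if model_version == 'v28' -%}
        cms_hcc_v28_flag
    {%- else -%}
        cms_hcc_v24_flag
    {%- endif -%}
{% endmacro %}
```

A model could then select {{ hcc_flag_column(var('cms_hcc_model_version', 'v24')) }} rather than hard-coding the flag column.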

Update ICD-10-PCS terminology file

The file we have now contains some codes that are three characters in length (e.g. "02N","Heart and Great Vessels, Release","Heart and Great Vessels, Release"). I don't believe these should be here. The CMS file does not contain them, and procedure codes should be 7 characters in length.
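
A quick data-quality query can surface the offending rows; a minimal sketch, where the table and column names are assumptions based on Tuva naming conventions:

```sql
-- Minimal sketch: surface ICD-10-PCS codes that are not 7 characters.
-- Table and column names are assumptions based on Tuva conventions.
select icd_10_pcs, description
from terminology__icd_10_pcs
where length(icd_10_pcs) <> 7
```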

Add logic for institutional status to CMS HCC

The CMS HCC model includes a risk segment for "institutional status". This status comes from the LTI Flag on the MMR (monthly membership report). Basic payer eligibility data does not include this status.

From CMS Medicare Managed Care Manual:

To determine a beneficiary's LTI status for payment purposes, CMS uses the reporting of a 90-day assessment. This information is collected routinely from nursing homes, which report to the States and CMS on at least a quarterly basis. This data is stored in the Minimum Data Set (MDS). Payment at the long-term rate starts in the month following the assessment date. Once persons are identified, they remain in long-term status until discharged to the community for more than fourteen days. The costs of the short term institutionalized (less than 90 days) are recognized in the community model.

Currently, we are using a default of No for institutional status. Add logic to the eligibility prep model to calculate this status.
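
Absent the MMR feed, one hedged approach is to approximate LTI status from long institutional stays. A Snowflake-flavored sketch, where the model names, the encounter_type value, and the 90-day rule are all assumptions rather than settled design:

```sql
-- Hypothetical sketch: approximate institutional (LTI) status from
-- institutional stays of 90+ days when no MMR LTI flag is available.
-- Simplifications: ignores the 14-day community-discharge rule and MDS
-- assessment timing; assumes year_month is a YYYYMM string.
with long_stays as (

    select
        patient_id,
        encounter_end_date
    from encounter
    where encounter_type = 'skilled nursing facility'
      and datediff(day, encounter_start_date, encounter_end_date) >= 90

)

select
    mm.patient_id,
    mm.year_month,
    -- max() resolves duplicates from multiple stays ('Yes' > 'No' alphabetically)
    max(case when ls.patient_id is not null then 'Yes' else 'No' end) as institutional_status
from member_months mm
left join long_stays ls
    on mm.patient_id = ls.patient_id
   and mm.year_month > to_char(ls.encounter_end_date, 'YYYYMM')
group by mm.patient_id, mm.year_month
```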

Remove Old Packages from dbt Hub

We need to update dbt Hub so that only the packages we want users to adopt are on it. The only packages we want users to adopt are the following:

  • the_tuva_project
  • the_tuva_project_demo
  • medicare_cclf_connector
  • medicare_lds_connector

However, dbt Hub currently lists a bunch of old packages (screenshot omitted).

Update CMS Chronic Conditions mart with new CMS version

CMS updated the chronic condition algorithms in February. CMS added diagnosis codes for Anemia, Diabetes, Rheumatoid Arthritis/Osteoarthritis, and OUD. CMS also added NDC codes for OUD.

Updated mapping docs added to Google Drive

To Do

  • Update the codes list for the above-listed chronic conditions.

Open Questions

  1. Do we want to implement CMS's stricter logic of filtering claims?

Acute Inpatient Data Mart

We want to refactor the existing encounter grouper data mart into an acute inpatient data mart that supports rich analytics on acute inpatient care.

Add Staging Layer to Every Data Mart

We're adding a staging layer to every data mart to make it more obvious what models a user needs to create before they can run a specific data mart.

We're not adding a staging layer to core, data profiling, encounter grouper, or claim date profiling at this time.

The staging layer is a folder under the data mart folder in models called staging. The staging models merely select the columns needed for the data mart. Right now we are not doing any casting, filtering, or column renaming. We may do this in the future.

The naming convention for the staging models is data_mart__stg_model_name, where model_name is the name of the model that feeds the stage.
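
Under that convention, a staging model is little more than a column selection. An illustrative sketch (the mart name and column list are examples only):

```sql
-- e.g. models/readmissions/staging/readmissions__stg_medical_claim.sql
-- Illustrative only: select just the columns the mart needs, with no
-- casting, filtering, or renaming.
select
    claim_id,
    claim_line_number,
    patient_id,
    claim_start_date,
    claim_end_date,
    paid_amount
from {{ ref('medical_claim') }}
```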

Add Service Categories to PMPM

Currently PMPM only has limited service categories. Add all level 1 and level 2 service categories to PMPM to support financial analytics.

Add OREC to Claims and Core data models

Update the claims data model to include two new fields in the Eligibility model.

  • orec (Original Reason for Entitlement Code)
  • plan_id (Plan specific contract number, group number, etc.)

The following packages/projects will need to be updated to include these new fields.

  • the_tuva_project:
    • integration_tests/Eligibility - source yml and claims data model definitions (this is what controls the data dictionaries on the website)
    • core__eligibility
    • data_profiling__eligibility_missing_values
  • medicare_lds_connector (map OREC from the MBSF)
  • medicare_cclf_connector (map BENE_ORGNL_ENTLMT_RSN_CD from CCLF8 File)
  • fhir_connector
  • the_tuva_project_demo (seeds)
  • CI Testing datasets

Update Terminology Catalog

  • Several terminology sets are in the catalog but do not actually exist on the website. This includes: RBCS, CCSR, SNOMED-CT, LOINC, ATC, NDC, RxNorm, Medicare Specialty, and NUCC Taxonomy. When a user clicks one of the links for these in the catalog, the user is redirected back to the catalog page.
  • Maintainer is not filled out for demographic and geographic terminology sets.
  • Last updated dates are mostly blank
  • I would also love to add a search bar to the catalog if that's not too difficult

Refactor Tuva Repo

  • Data Marts: Transfer SQL and value sets to mono-repo for Readmissions, Chronic Conditions, and PMPM (include service category value set)
  • Configuration: Update configuration by using refs and setting schemas in models.yml and remove all variables that make it confusing.
  • Terminology: Any updates necessary to get terminology working. What about S3 API keys??
  • README: Update
  • CI/CD: Update
  • Rename Repo to Tuva
  • Make individual repos private

Fix Service Category Assignment Issue for Institutional Claims

Institutional claims typically only have paid amounts at the header level. This means that in the Tuva claims data model, paid amounts will typically exist on only one line of an institutional claim. Currently, service categories are assigned at the claim line level for institutional claims, so the paid amount often sits on a different claim line than the one a service category is assigned to. As a result, the current grouper logic excludes a large share of paid amounts from the proper service category.
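
One hedged way to fix this is to roll paid amounts up to the claim header before attaching the service category. A sketch, with model and column names that are illustrative only:

```sql
-- Hypothetical sketch: aggregate institutional paid amounts to the claim
-- header, then join a claim-level (not line-level) service category
-- assignment. Model and column names are illustrative only.
with header_paid as (

    select
        claim_id,
        sum(paid_amount) as paid_amount
    from medical_claim
    where claim_type = 'institutional'
    group by claim_id

)

select
    sc.claim_id,
    sc.service_category_1,
    sc.service_category_2,
    hp.paid_amount
from claim_service_category sc  -- assumed to be one row per claim
inner join header_paid hp
    on sc.claim_id = hp.claim_id
```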

CMS HCC Feature request: ability to calculate scores per data source and across data sources

From [email protected] via Slack:

On the third use case, this might be more specific to value based care organizations, but generally speaking if you're contracting with a payer, they want to know what you're contributing to their RAF. You'll also generally be receiving a regular claim feed from the payer for all claims for your attributed population. No papers I'm aware of - this is just from building out risk adjustment at VBC orgs

  • We do this with two calculations (see the sketch below):
    • Calculate RAF including payer claims + our claims
    • Calculate RAF with just payer claims (excluding our claims)
    • Take the delta of the two, and that is your "unique RAF" contribution.

Basically you use this to make the case to the payer of the value you are creating.
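
A hedged sketch of that delta calculation, assuming the CMS HCC mart (or a wrapper around it) can produce a risk score table per claim population; every model name here is illustrative:

```sql
-- Hypothetical sketch: "unique RAF" contribution as the difference between
-- a RAF computed on all claims and a RAF computed on payer claims only.
-- Both input models are assumptions, not actual Tuva CMS HCC mart models.
with raf_all_sources as (
    select patient_id, raf_score
    from risk_scores_all_claims
),

raf_payer_only as (
    select patient_id, raf_score
    from risk_scores_payer_claims_only
)

select
    a.patient_id,
    a.raf_score as raf_all_sources,
    p.raf_score as raf_payer_only,
    a.raf_score - p.raf_score as unique_raf_contribution
from raf_all_sources a
inner join raf_payer_only p
    on a.patient_id = p.patient_id
```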

Add the ED classification Mart

This has already been implemented as a separate repo here:

However, the code should be modified to run off the Core data model (i.e. encounter and condition rather than medical_claim).

Add logic for C-SNP enrollees to CMS HCC

The CMS HCC model includes a risk segment for new beneficiaries enrolled in a "C-SNP" (Chronic Condition Special Needs Plan). We may be able to use the Chronic Conditions Mart to determine which new enrollees may fall into this status.

For a list of the conditions covered by the special needs plan, refer to Table 2-3 of this report.

Instruct users to not use eligibility or medical claim in dbt_project.yml

In a connector, the dbt_project.yml cannot contain a var called eligibility or medical_claim that points to the source data. This creates a conflict with claims preprocessing. The README should be updated to reflect this.

Personal example:
In the claims_preprocessing staging models, there are {{ var('eligibility') }} and {{ var('medical_claim') }} references. In the dbt_project.yml of my connector, under vars, I had eligibility: "{{ source(var('source'),'eligibility') }}". This resulted in {{ var('eligibility') }} being linked to my source eligibility table rather than the final eligibility table I created for mapping.
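
A hedged illustration of the conflict and one way around it; the connector-specific var name (raw_eligibility) is hypothetical:

```yaml
# dbt_project.yml of a connector -- illustrative sketch only.
vars:
  # Problematic: this overrides the eligibility var that
  # claims_preprocessing resolves, pointing it at raw source data
  # instead of the final mapped model.
  # eligibility: "{{ source(var('source'), 'eligibility') }}"

  # Safer: use a connector-specific name for the raw source reference
  # (raw_eligibility is hypothetical), and let the final mapped model
  # satisfy {{ var('eligibility') }}.
  raw_eligibility: "{{ source(var('source'), 'eligibility') }}"
```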

Add CCSR Data Mart to Tuva

  • Transfer code and value sets to Tuva repo
  • Test on BigQuery and Redshift
  • If there is preprocessing for value sets, where should these live?
  • KB updates:
    • update diagram
    • add page under Data Marts

Fix CCSR Issue in Current Release

CCSR was never tested on BigQuery or Redshift. Aaron added it to the Tuva Project and merged it to main after mistakenly thinking all tests passed. As a result, the Tuva Project currently does not run on BigQuery or Redshift.

Enhancement requests for FIPS terminology

Enhancement requests for terminology__fips_county. This would help with visualizations. (A sketch of the state/county split follows the list.)

  • Split out the state fips code into a separate column
  • Add the full name for the state so that we can merge in datasets by the full state name
  • Add another FIPS file for zip code to county crosswalk
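
A 5-digit county FIPS code is a 2-digit state code followed by a 3-digit county code, so the split is a simple substring. A minimal sketch, where the column names are assumptions about terminology__fips_county:

```sql
-- Minimal sketch: split the 5-digit county FIPS into state and county
-- parts. Column names are assumptions about terminology__fips_county.
select
    fips_code,
    substr(fips_code, 1, 2) as state_fips_code,
    substr(fips_code, 3, 3) as county_fips_code,
    county
from terminology__fips_county
```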

Release 3/29/23: Website 2.0

Definition of done:

  • Set up section complete - Thu Xuan
    • Setting up the Tuva Project - write-up and video
    • Setting up Just Terminology - write-up and video
  • Data Model section complete - Forrest
    • populate data dictionaries from YAML files
  • Terminology and Value Sets section complete - Forrest
    • populate data dictionaries from YAML files
    • render seed files (small files only)
    • SQL scripts to load data from S3 into Snowflake, Redshift, and BigQuery
    • data shares available for Snowflake, Redshift, and BigQuery (i.e. any delivery mechanism)
    • Not included in this release: new terminology data sets, Databricks support (for Terminology SQL), review of every terminology set, maintenance process
  • Claims Data section complete - Aaron
    • Review and update sections on claims data as needed
    • Complete and release current code work on claims preprocessing; update website - Coco
  • Measures and Groupers section complete - Aaron
    • Review and update sections on released measures and groupers as needed
  • Announcement - Aaron
    • Tuva Slack - Announcements channel
  • Actual Release - Aaron and Forrest

Add unit tests and integration testing to the CMS HCC mart

The CMS HCC mart has been manually tested with various methods (e.g. manually calculated scores for a random sample of patients, ran the HCCpy test patients through and compared results). We need to build in better unit/integration testing to ensure that future changes continue to produce the expected results.

  • Unit tests - need to test smaller units of logic and mapping
  • Integration tests - full integration testing with a curated list of mock patients to test various risk factors (this may be complicated by the mart being added to the Tuva Project mono-repo)

Upgrade CI Testing

Convert to GitHub Actions instead of dbt Cloud: jobs can run in parallel, report success/failure of each environment individually, and are not tied to one specific dbt Cloud account. (A sketch of such a workflow follows the list.)

  • Create one job for Snowflake
  • Create one job for Redshift
  • Create one job for BigQuery
  • Discuss with team creating jobs for each version of dbt for each environment
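
A hedged sketch of such a workflow, using a matrix to fan the warehouses out in parallel; the workflow name, adapter install, and profile handling are illustrative only:

```yaml
# Hypothetical GitHub Actions sketch: one matrix job per warehouse.
# Workflow name, adapter packages, and profile setup are illustrative.
name: integration-tests
on: [pull_request]

jobs:
  dbt-build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        target: [snowflake, redshift, bigquery]
    steps:
      - uses: actions/checkout@v3
      - run: pip install dbt-${{ matrix.target }}
      - run: dbt deps && dbt build --target ${{ matrix.target }}
        env:
          DBT_PROFILES_DIR: ./ci  # assumes a checked-in ci profiles.yml
```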

Adding CMS-HCC Data Mart to Tuva Project

  • Need to come up with an analytics story for this data mart. Trending risk over time? Identifying under-coded patients?
  • Create summary data tables that tell the analytics story and add them to the data mart

Add inpatient/outpatient logic to CCSR

When the latest Tuva Core Data Model changes are released with Claims Preprocessing, add logic to map categories for inpatient vs outpatient based on encounter_type.
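
A hedged sketch of what that mapping could look like once encounter_type is available; the encounter_type value and the output column name are assumptions:

```sql
-- Illustrative sketch only: derive the CCSR setting (inpatient vs.
-- outpatient) from encounter_type on the Core encounter model.
-- The encounter_type value shown is an assumption.
select
    encounter_id,
    case
        when encounter_type = 'acute inpatient' then 'inpatient'
        else 'outpatient'
    end as ccsr_setting
from {{ ref('encounter') }}
```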

Add BETOS to Terminology

  • Add dataset to S3
  • Update SQL scripts for direct download
  • Update Tuva repo so it loads this dataset
  • Update Catalog

Primary key test on member_months table

The current primary key test for the member_months table is patient_id and year_month. However, this table includes information from all payers, so the current primary key is not at the right grain. I think payer and payer_type should be added.
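
One way to enforce the proposed grain is a composite-uniqueness test; a sketch using dbt_utils, where the final column list is the proposal above rather than a settled decision:

```yaml
# Sketch: composite primary-key test for member_months using dbt_utils.
# Including payer and payer_type reflects the proposal above, not a
# settled decision.
models:
  - name: member_months
    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - patient_id
            - year_month
            - payer
            - payer_type
```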

Update Readmission mart with new CMS version

CMS released updated versions of all three readmission measures on 5/3/23. The hospital-wide measure (HWR) updates include updated ICD-10 codes used in the measure.

Updated mapping docs added to Google Drive.

To Do

  • Update the Readmission Mart with the hospital-wide measure changes.

Open Questions

  1. Do we want to maintain the previous version implemented and give the option to run either version or should we just update the measure to the latest version?
  2. Do we want to add the other measures, "Condition-Specific" and "Procedure-Specific"? Currently, only "Hospital-Wide" has been implemented.
