gossis's Introduction

GOSSIS: The Global Open Source Severity of Illness Score

For more information about the GOSSIS consortium, please visit https://gossis.mit.edu/

This repository covers the extraction part of the GOSSIS pipeline. It extracts consistent concepts from multiple databases. Currently only the ANZICS and eICU-CRD datasets are used to build the GOSSIS-1 model.

For code used to compute GOSSIS predictions, please see: https://github.com/jraffa/rGOSSIS1

Adding variables

  1. Add the info to etc/variable-definitions.yaml
  2. Run python3 variablelist2description.py
  3. Run python3 variablelist2header.py
  4. Add the variable to each load-data.ipynb - sometimes this involves adding it to underlying SQL scripts as well
  5. Re-run everything - if not available, the notebooks should prompt you about it
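Steps 2 and 3 can be chained in a small helper. The sketch below is a convenience wrapper, not part of the repository: `build_commands` and `regenerate` are hypothetical names, and it assumes both scripts sit in the current working directory.

```python
import subprocess
import sys

# Scripts named in steps 2 and 3, in the order they should run.
SCRIPTS = ["variablelist2description.py", "variablelist2header.py"]


def build_commands(python=sys.executable):
    """Return one command line per script, preserving the step order."""
    return [[python, script] for script in SCRIPTS]


def regenerate(run=subprocess.run):
    """Run each script in turn; check=True stops at the first failure."""
    for cmd in build_commands():
        run(cmd, check=True)
```

Calling `regenerate()` after editing etc/variable-definitions.yaml would then cover steps 2 and 3; steps 4 and 5 still have to be done by hand in the notebooks.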

gossis's People

Contributors

alistairewj, jraffa, jueurousseau, mfabre

gossis's Issues

Decide on icu_admit_source dictionary

Need to agree on a subset of terms for ICU admission source.

Current options:

| data_source | mapped_to | anzics | eicu | orchestra | satiq |
|---|---|---|---|---|---|
| Accident & Emergency | Accident & Emergency | 71713 | 0 | 31348 | 0 |
| Acute Care/Floor | Floor | 0 | 3306 | 0 | 0 |
| Chest Pain Center | Floor | 0 | 280 | 0 | 0 |
| Direct Admit | Accident & Emergency | 0 | 9931 | 0 | 0 |
| Emergency Department | Accident & Emergency | 0 | 67266 | 0 | 0 |
| Floor | Floor | 0 | 15078 | 0 | 0 |
| Home-care | | 0 | 0 | 217 | 0 |
| ICU | Other ICU | 0 | 149 | 0 | 0 |
| ICU to SDU | Other ICU | 0 | 30 | 0 | 0 |
| Intermediate care unit | | 0 | 0 | 469 | 0 |
| Intervention room | | 0 | 0 | 2129 | 0 |
| OT/Recovery | Operating Room / Recovery | 141593 | 0 | 17319 | 0 |
| Observation | Floor | 0 | 10 | 0 | 0 |
| Operating Room | Operating Room / Recovery | 0 | 20057 | 0 | 0 |
| Other | Floor | 0 | 8 | 0 | 0 |
| Other Hospital | Other Hospital | 14292 | 3475 | 2401 | 0 |
| Other Hospital ICU | Other ICU | 2196 | 0 | 0 | 0 |
| Other ICU | Other ICU | 0 | 862 | 0 | 0 |
| Other ICU, same Hospital | Other ICU | 440 | 0 | 1105 | 0 |
| Other unkown | | 0 | 0 | 231 | 0 |
| PACU | Operating Room / Recovery | 0 | 1338 | 0 | 0 |
| Recovery Room | Operating Room / Recovery | 0 | 6119 | 0 | 0 |
| Step-Down Unit (SDU) | Floor | 0 | 2901 | 0 | 0 |
| Ward | Floor | 35750 | 0 | 4474 | 0 |
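For discussion, the current proposal could be written down as a lookup table. This is only a sketch of the options listed here, not an agreed dictionary: `ADMIT_SOURCE_MAP` and `map_admit_source` are hypothetical names, and the four sources with no proposed mapping are kept as `None` so they stay visible.

```python
# Proposed data_source -> mapped_to pairs, copied from the table above.
# None marks sources that have no agreed target category yet.
ADMIT_SOURCE_MAP = {
    "Accident & Emergency": "Accident & Emergency",
    "Acute Care/Floor": "Floor",
    "Chest Pain Center": "Floor",
    "Direct Admit": "Accident & Emergency",
    "Emergency Department": "Accident & Emergency",
    "Floor": "Floor",
    "Home-care": None,
    "ICU": "Other ICU",
    "ICU to SDU": "Other ICU",
    "Intermediate care unit": None,
    "Intervention room": None,
    "OT/Recovery": "Operating Room / Recovery",
    "Observation": "Floor",
    "Operating Room": "Operating Room / Recovery",
    "Other": "Floor",
    "Other Hospital": "Other Hospital",
    "Other Hospital ICU": "Other ICU",
    "Other ICU": "Other ICU",
    "Other ICU, same Hospital": "Other ICU",
    "Other unkown": None,  # spelling as it appears in the source data
    "PACU": "Operating Room / Recovery",
    "Recovery Room": "Operating Room / Recovery",
    "Step-Down Unit (SDU)": "Floor",
    "Ward": "Floor",
}


def map_admit_source(value):
    """Return the proposed category, or None if unmapped or unknown."""
    return ADMIT_SOURCE_MAP.get(value)


def unmapped_sources():
    """List the sources that still need a decision."""
    return sorted(k for k, v in ADMIT_SOURCE_MAP.items() if v is None)
```

`unmapped_sources()` returns the entries that still need a target category, all of which currently come from the orchestra column.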

Add respiratory rate from respiratoryCharting

Right now, nurse charting provides a respiratory rate for about 90% of non-excluded unit stays:

-- completeness of day-1 respiratory rate from nurse charting
select
  count(pt.patientunitstayid) as num_total
  , count(vnc_d1.resprate_max) as rr
  , count(vnc_d1.resprate_max)::numeric / count(pt.patientunitstayid) * 100.0 as percentage_complete
from gossis_cohort pt
left join gossis_vital_nc_d1 vnc_d1
  on pt.patientunitstayid = vnc_d1.patientunitstayid
where excluded = 0;

If we add respiratory rate from respiratoryCharting, we could probably raise this above 90%. We would then have to decide whether it is worth adding vitalPeriodic to "clean up" the rest.
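The fallback idea can be sketched in plain Python. This assumes nurse charting is preferred whenever it has a value and respiratoryCharting only fills the gaps; that precedence is an assumption, as are the function names and the toy values in the usage note.

```python
def coalesce_resprate(nurse_charting, respiratory_charting):
    """Prefer the nurse charting value; fall back to respiratoryCharting.

    Both arguments map patientunitstayid -> value (None when missing).
    Only stays present in nurse_charting (the cohort) are returned.
    """
    merged = {}
    for stay_id, value in nurse_charting.items():
        if value is None:
            value = respiratory_charting.get(stay_id)
        merged[stay_id] = value
    return merged


def percentage_complete(values):
    """Percentage of stays with a non-missing value."""
    if not values:
        return 0.0
    present = sum(v is not None for v in values.values())
    return 100.0 * present / len(values)
```

With toy inputs `{1: 20, 2: None, 3: None, 4: 18}` and `{2: 22}`, completeness goes from 50.0 to 75.0 after the merge; stay 3 stays missing and would be the kind of case vitalPeriodic might cover.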

Total Sample Sizes / Exclusions for CONSORT

For the paper:

  • Need the total dataset size for each data_source, regardless of whether it was included in the final gossis-data-2018-03-20.csv.gz dataset.
  • Bookkeeping of exclusions for ANZICS/eICU up to the point of gossis-data-2018-03-20.csv.gz (e.g., pre-2014 admissions, age <18, etc.).
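One way to keep this bookkeeping is to apply exclusions in a fixed order and log how many rows each one removes. The sketch below is illustrative only: `consort_counts` is a hypothetical helper, and the `admit_year` and `age` fields are placeholder names rather than actual columns.

```python
def consort_counts(rows, exclusions):
    """Apply exclusions in order and record how many rows each removes.

    rows: list of dicts, one per admission.
    exclusions: list of (label, predicate); predicate(row) is True
    when the row should be excluded.
    Returns (remaining_rows, log) where log is a list of (label, count).
    """
    log = [("total", len(rows))]
    remaining = list(rows)
    for label, should_exclude in exclusions:
        kept = [r for r in remaining if not should_exclude(r)]
        log.append((label, len(remaining) - len(kept)))
        remaining = kept
    log.append(("included", len(remaining)))
    return remaining, log


# Example exclusions mentioned above; field names are hypothetical.
EXAMPLE_EXCLUSIONS = [
    ("pre-2014 admission", lambda r: r["admit_year"] < 2014),
    ("age < 18", lambda r: r["age"] < 18),
]
```

Because exclusions are applied sequentially, a row that meets several criteria is counted only under the first one, which is the convention a CONSORT flow diagram needs.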

Merging blood gas information

For ANZICS we only have APACHE blood gases (pH, PaO2, PaCO2, FiO2). For eICU, we can get min/max (~35% completion), or APACHE (~23% completion). Regardless, we'd need to think of a sensible way of merging APACHE worst with min/max value.
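As a starting point for that discussion, one possible rule is to treat the APACHE value as one more observed measurement and widen the min/max range to include it. This is a sketch of a single option, not a recommendation; `merge_with_apache` is a hypothetical name.

```python
def merge_with_apache(d1_min, d1_max, apache_value):
    """Widen a (min, max) pair so it also covers the APACHE value.

    Any argument may be None when missing. Returns (None, None) only
    when all three inputs are missing.
    """
    values = [v for v in (d1_min, d1_max, apache_value) if v is not None]
    if not values:
        return None, None
    return min(values), max(values)
```

When only the APACHE value exists (the ANZICS case), both min and max collapse to that value, so this rule raises completion but does not recover the true day-1 range.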

Missing /ETC file

The file "apache2-definitions.csv", which merge-data.ipynb refers to, is missing. Without it, GOSSIS cannot run properly on the dataset.

issues with current data

eICU

  • eICU BMI is all zeros
  • eicu po2 max accidentally coded as min?
  • eicu maximum diasbp is very high OR anzics diasbp is very low
  • eicu missing pao2:fio2 max ratio

anzics

  • anzics minimum diasbp usually lower
  • missing d1 glucose for anzics
  • hco3 anzics units are wrong
  • hct for anzics is missing
  • anzics potassium min/max units are wrong
  • anzics sodium min/max units are wrong
  • temperature source in eicu vs anzics?
  • fio2 present in anzics?
  • apache glucose missing from anzics
  • anzics missing hematocrit_apache
  • anzics sodium_apache units wrong

mimic

  • inr mimic max/min switched?
  • mimic ph_apache is it arterial? has high pHs

MIMIC in general: how to define apache variables if score=0? e.g. messes up temperature

worth comment

  • hgb difference in mimic
  • hct difference in mimic
  • mbp high in anzics
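The suspected min/max swaps listed above (eICU pO2, MIMIC INR) are the kind of problem a simple min <= max check would flag. The helper below is a sketch; `find_swapped` and the column names in the usage note are hypothetical.

```python
def find_swapped(rows, min_key, max_key):
    """Return indices of rows where the min column exceeds the max column.

    Rows with either value missing (None or absent) are skipped.
    """
    flagged = []
    for i, row in enumerate(rows):
        lo, hi = row.get(min_key), row.get(max_key)
        if lo is not None and hi is not None and lo > hi:
            flagged.append(i)
    return flagged
```

For example, `find_swapped(rows, "inr_min", "inr_max")` returns the positions of rows whose minimum is larger than their maximum; if nearly every row is flagged, the two columns were probably switched at extraction.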

Add ventilated for RR

We should add a covariate which indicates if the patient was ventilated for the APACHE RR.

Standardize ranges for valid data

At the moment we don't have a standardized range of valid values for each variable. Having one would help remove outliers and junk data.

Philips suggests the following (based on APACHE).

| Variable | Low range | High range |
|---|---|---|
| pH values | 6 | 9 |
| paCO2 | 1 | 150 |
| paO2 | 1 | 900 |
| Creatinine | 0.1 | 20 |
| Hematocrit | 5 | 100 |
| WBC | 0.1 | 200 |
| BUN | 1 | 225 |
| Sodium | 80 | 200 |
| Bilirubin | 0.01 | 75 |
| Albumin | 1 | 10 |
| Glucose | 1 | 3000 |
| Temperature (Fahrenheit) | 68 | 109 |
| Temperature (Celsius) | 20 | 43 |
| Heart rate | 20 | 220 |
| Respiratory rate | 4 | 60 |
| Blood pressure - Mean | 0 | 270 |
| Blood pressure - Systolic | 0 | 300 |
| Blood pressure - Diastolic | 0 | 250 |
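These ranges could be applied as a simple filter that blanks out-of-range values. The sketch below assumes the bounds are inclusive and uses hypothetical variable keys; neither is specified above.

```python
# (low, high) bounds copied from the table above.
# Keys are illustrative names, not the pipeline's column names.
VALID_RANGES = {
    "ph": (6, 9),
    "paco2": (1, 150),
    "pao2": (1, 900),
    "creatinine": (0.1, 20),
    "hematocrit": (5, 100),
    "wbc": (0.1, 200),
    "bun": (1, 225),
    "sodium": (80, 200),
    "bilirubin": (0.01, 75),
    "albumin": (1, 10),
    "glucose": (1, 3000),
    "temperature_f": (68, 109),
    "temperature_c": (20, 43),
    "heart_rate": (20, 220),
    "resp_rate": (4, 60),
    "bp_mean": (0, 270),
    "bp_systolic": (0, 300),
    "bp_diastolic": (0, 250),
}


def clean_value(variable, value):
    """Return value if it lies within the valid range, else None.

    Bounds are treated as inclusive (an assumption). Missing values
    pass through as None; unknown variables raise KeyError.
    """
    if value is None:
        return None
    low, high = VALID_RANGES[variable]
    return value if low <= value <= high else None
```

For instance, `clean_value("heart_rate", 250)` returns `None` while `clean_value("heart_rate", 80)` returns `80`. Setting out-of-range values to missing rather than dropping the row keeps the patient's other measurements.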
