
event-cui-transfer's Introduction

Predicting Clinical Outcomes Across Changing Electronic Health Record Systems

Presented at KDD 2017 in Halifax, Nova Scotia

August 2017

Jen Gong, Tristan Naumann, Peter Szolovits, and John Guttag

CSAIL, MIT

This repository contains the code for the experiments detailed in "Predicting Clinical Outcomes Across Changing Electronic Health Record Systems," presented at KDD 2017 in Halifax, Nova Scotia. We used data from MIMIC III, v 1.3.

This code repository has four parts.

  1. Cohort and data extraction: we provide PostgreSQL queries for MIMIC III v 1.3 for the cohort, outcome, and feature extraction in get_data. We assume that several tables provided with the MIMIC installation (e.g., SAPS-II acuity score) have already been built prior to running our code.
  • full_script.sh: Runs relevant SQL queries in the correct sequence to create materialized views for patient cohort extraction.
    • We extract the first ICU stay of all adult (at least 18 yrs old) patients from MIMIC III v 1.3.
    • We extract data for the first 24 hours of each patient's ICU stay, but the functions that are provided in write_functions.sql can easily be extended to extract the same data for other periods during the patient stay.
      • get_data/write_functions.sql: Defines SQL functions for extracting patient identifier, item identifier, value, and time from ICU admission for specified intervals of time.
      • get_data/export_data.sql: Export chartevents, labevents, inputevents, outputevents, services data to CSV.
      • get_data/process_itemid_values.py: Process chartevents Item IDs and values, creating new Item IDs for each unique value containing text that modifies a chart item. This is done because many of the charted items have semantically meaningful values that modify the description of the original item.
    • In addition, full_script.sh creates relevant directories for cTakes annotation and writes to file the descriptions that will be used as input for cTakes.
  2. Bag-of-events feature construction: we provide code that generates the bag-of-events (BOE) feature representation used in the paper. This code contains SQL functions to extract the relevant data from the MIMIC-III data warehouse, along with data processing functions for the resulting bag-of-events vectors.
  • build_unigram_boe.py: Functions to build BOE vectors for each subject ID.
  • generate_data_pipeline.py: Script to build BOE vectors for patients from each careunit based on events during the first 24 hours of the patient's ICU stay.
  3. MIMIC-III Item ID to UMLS Concept Unique Identifier (CUI) mappings: we provide code that maps the bag-of-events feature representations extracted from MIMIC-III to the UMLS CUI space. This portion of the code requires the user to hold a UMLS license in order to access the necessary ontologies.
  4. Experiments: setup/prediction code for the experiments detailed in the paper.
  • learn_classifier_db.py: Trains and evaluates a model on a single database version (experiments on CareVue alone or MetaVision alone) to demonstrate the performance of BOE features against SAPS-II and to compare the different methods of mapping to CUIs against using EHR-specific Item IDs.
  • learn_classifier_mv_cv.py: Trains a model on one EHR version (either CareVue or MetaVision) and tests it on the other. This evaluates the portability of a predictive model learned on one EHR and applied to another, both with EHR-specific Item IDs and with a shared semantic feature space (UMLS Concept Unique Identifiers, or CUIs).
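
The value-dependent Item ID step described for get_data/process_itemid_values.py can be illustrated with a minimal sketch. This is not the repository's implementation: the function names, the `itemid:value` naming scheme, and the example rows are all assumptions made for illustration. The idea shown is only that a charted item with a textual value becomes its own feature, while numeric values leave the Item ID unchanged.

```python
import re


def is_numeric(value):
    """Return True if the charted value parses as a number."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def value_specific_item_id(itemid, value):
    """Combine an Item ID with a textual value into a new identifier.

    Hypothetical helper: numeric or missing values keep the original
    Item ID; textual values are lowercased, whitespace-normalized, and
    appended so that each (item, text value) pair is a distinct feature.
    """
    if value is None or is_numeric(value):
        return str(itemid)
    normalized = re.sub(r"\s+", "_", str(value).strip().lower())
    if not normalized:
        return str(itemid)
    return f"{itemid}:{normalized}"


# Illustrative (itemid, value) rows, not taken from MIMIC-III.
rows = [
    (212, "Normal Sinus"),
    (212, "Atrial Fib"),
    (211, "86"),
    (212, "normal  sinus"),
]
new_ids = [value_specific_item_id(itemid, value) for itemid, value in rows]
print(new_ids)
```

Normalizing case and whitespace before building the new identifier keeps trivially different spellings of the same charted text from becoming separate features.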

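The bag-of-events construction and the Item ID to CUI mapping listed above can be sketched together as follows. This is a simplified illustration under stated assumptions, not the code in build_unigram_boe.py: the function names, the item identifiers (`cv_hr`, `mv_hr`, and so on), and the placeholder CUI strings are hypothetical, and the sketch uses event counts, whereas this README does not state whether the repository uses counts or binary indicators.

```python
from collections import Counter, defaultdict


def build_boe(events):
    """Build a bag-of-events per subject.

    events: iterable of (subject_id, item_id) pairs.
    Returns a dict mapping subject_id to a Counter of item_id occurrences.
    """
    boe = defaultdict(Counter)
    for subject_id, item_id in events:
        boe[subject_id][item_id] += 1
    return dict(boe)


def map_to_cuis(item_counts, item_to_cuis):
    """Re-express one subject's item counts in a shared CUI space.

    Each item contributes its count to every CUI it maps to.
    Items with no mapping are dropped (an assumption of this sketch).
    """
    cui_counts = Counter()
    for item_id, count in item_counts.items():
        for cui in item_to_cuis.get(item_id, ()):
            cui_counts[cui] += count
    return cui_counts


# Hypothetical events: subject 1 recorded in one EHR, subject 2 in another.
events = [
    (1, "cv_hr"), (1, "cv_hr"), (1, "cv_temp"),
    (2, "mv_hr"), (2, "mv_unmapped"),
]
# Placeholder mapping; real CUIs require UMLS access.
item_to_cuis = {
    "cv_hr": ["CUI_A"],
    "mv_hr": ["CUI_A"],
    "cv_temp": ["CUI_B", "CUI_C"],
}

boe = build_boe(events)
cui_boe = {sid: map_to_cuis(counts, item_to_cuis) for sid, counts in boe.items()}
print(cui_boe)
```

In the Item ID space the two subjects share no features, since `cv_hr` and `mv_hr` are distinct identifiers. After mapping, both have a count for `CUI_A`, which is what lets a model trained on one EHR version be applied to the other.
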