Updates

  1. Upgraded to Python 3
  2. Bug fixes
  3. Included an example Jupyter notebook for easy implementation

Valx

A Python tool to extract and structure numeric lab test comparison statements from text

Objectives: To develop an automated method for extracting and structuring numeric lab test comparison statements from text, and to evaluate the method using clinical trial eligibility criteria text.

Methods: Leveraging semantic knowledge from the Unified Medical Language System (UMLS) and domain knowledge acquired from the Internet, Valx takes 7 steps to extract and normalize numeric lab test expressions:

  1. text preprocessing
  2. numeric, unit, and comparison operator extraction
  3. variable identification using hybrid knowledge
  4. variable-numeric association
  5. context-based association filtering
  6. measurement unit normalization
  7. heuristic rule-based comparison statement verification

Our reference standard was the consensus-based annotation among three raters for all comparison statements for two variables, i.e., HbA1c and glucose, identified from all Type 1 and Type 2 diabetes trials in ClinicalTrials.gov.

Results: The precision, recall, and F-measure for structuring HbA1c comparison statements were 99.6%, 98.1%, 98.8% for Type 1 diabetes trials, and 98.8%, 96.9%, 97.8% for Type 2 diabetes trials, respectively. The precision, recall, and F-measure for structuring glucose comparison statements were 97.3%, 94.8%, 96.1% for Type 1 diabetes trials, and 92.3%, 92.3%, 92.3% for Type 2 diabetes trials, respectively.

Conclusions: Valx is effective at extracting and structuring free-text lab test comparison statements in clinical trial summaries. Future studies are warranted to test its generalizability beyond eligibility criteria text. The open-source Valx enables its further evaluation and continued improvement among the collaborative scientific community.

Usage

import Valx_core

Clean text by preprocessing

Valx_core.preprocessing(text)

Split eligibility criteria text into inclusion and exclusion sections

Skip this step if the text is not clinical trial eligibility criteria text.

Valx_core.split_text_inclusion_exclusion(text)
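
As a rough illustration of what such a split involves, the sketch below separates criteria text on "Inclusion Criteria" and "Exclusion Criteria" heading lines. The function name split_sections and the heading pattern are assumptions made for this example; it is not the Valx_core implementation, which may recognize more heading variants.

```python
import re

# Hypothetical heading pattern; real criteria text uses many more variants.
_HEADING = re.compile(r"^\s*(inclusion|exclusion)\s+criteria\s*:?\s*$", re.IGNORECASE)

def split_sections(text):
    """Return (inclusion_text, exclusion_text) from eligibility criteria text."""
    sections = {"inclusion": [], "exclusion": []}
    current = "inclusion"  # text before any heading is treated as inclusion
    for line in text.splitlines():
        match = _HEADING.match(line)
        if match:
            current = match.group(1).lower()
        elif line.strip():
            sections[current].append(line.strip())
    return "\n".join(sections["inclusion"]), "\n".join(sections["exclusion"])

sample = """Inclusion Criteria:
- HbA1c between 7.0% and 10.0%
Exclusion Criteria:
- Fasting glucose > 270 mg/dL
"""
inclusion, exclusion = split_sections(sample)
print(inclusion)
print(exclusion)
```

Treating text that precedes any heading as inclusion text is a design choice of this sketch, not something the source specifies.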

Extract candidates containing numeric features

Valx_core.extract_candidates_numeric(text)
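
The idea of candidate extraction can be sketched as a simple filter: keep only the sentences that contain both a number and a known feature keyword. The names extract_candidates and NUMERIC_FEATURES, and the two keywords, are assumptions for illustration; Valx reads its own feature list from numeric_features.csv and its matching logic may differ.

```python
import re

# Hypothetical feature keywords used only for this example.
NUMERIC_FEATURES = ("hba1c", "glucose")

def extract_candidates(text, features=NUMERIC_FEATURES):
    """Keep sentences that contain both a digit and a known feature keyword."""
    # Split on line breaks, and after '.' or ';' when followed by whitespace.
    sentences = re.split(r"\n+|(?<=[.;])\s+", text)
    candidates = []
    for sentence in sentences:
        lowered = sentence.lower()
        has_number = re.search(r"\d", sentence) is not None
        has_feature = any(name in lowered for name in features)
        if has_number and has_feature:
            candidates.append(sentence.strip())
    return candidates

text = "Age 18 to 65 years. HbA1c >= 7.5%. History of stroke; fasting glucose < 126 mg/dL"
print(extract_candidates(text))
```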

Identify numerical expressions

Identify expressions and formalize them into labels, e.g., "<VML(tag) L(logic, e.g., greater_equal)=X U(unit)=X>value</VML>"

Valx_core.formalize_expressions(candidates)
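
To make the label format concrete, here is a minimal sketch that rewrites "operator number unit" spans into VML-style labels with a regular expression. Only the logic name greater_equal comes from the description above; the other logic names, the unit list, and the function name formalize are assumptions for this example and may not match what Valx emits.

```python
import re

# Assumed operator-to-logic mapping; only greater_equal is named in the README.
LOGIC = {
    ">=": "greater_equal",
    "<=": "lower_equal",
    ">": "greater",
    "<": "lower",
    "=": "equal",
}

# Operator, number, and an optional unit from a small illustrative unit list.
_EXPR = re.compile(r"(>=|<=|>|<|=)\s*(\d+(?:\.\d+)?)\s*(%|mg/dl|mmol/l)?", re.IGNORECASE)

def formalize(candidate):
    """Replace 'operator number unit' spans with VML-style labels."""
    def to_label(match):
        operator, value, unit = match.groups()
        unit = (unit or "").lower()
        return f"<VML L={LOGIC[operator]} U={unit}>{value}</VML>"
    return _EXPR.sub(to_label, candidate)

print(formalize("HbA1c >= 7.5%"))
```

The two-character operators are listed before the one-character ones in the pattern so that ">=" is not matched as ">" followed by a stray "=".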

Identify variable mentions and map them to names

Valx_core.identify_variable(expression_text, feature_dict_dk, fea_dict_umls)

Associate variable and its related numerical values

Valx_core.associate_variable_values(expression_text)

Context-based validation

Valx_core.context_validation(expressions)

Unit conversion and value normalization

Normalize units and their corresponding values

Valx_core.normalization(feature_list, expressions)
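
Unit normalization amounts to converting each value to one target unit per variable. The sketch below does this for glucose, assuming mmol/L as the target unit; the function name, the dictionary, and the choice of target unit are illustrative assumptions, and Valx may normalize to a different unit.

```python
# Illustrative factors for converting glucose values to mmol/L.
# 1 mmol/L of glucose is about 18.016 mg/dL (molar mass about 180.16 g/mol).
GLUCOSE_TO_MMOL_PER_L = {
    "mmol/l": 1.0,
    "mg/dl": 1.0 / 18.016,
}

def normalize_glucose(value, unit):
    """Convert a glucose value to mmol/L; raise ValueError for unknown units."""
    key = unit.strip().lower()
    if key not in GLUCOSE_TO_MMOL_PER_L:
        raise ValueError("unsupported glucose unit: %r" % unit)
    return value * GLUCOSE_TO_MMOL_PER_L[key], "mmol/L"

value, unit = normalize_glucose(126, "mg/dL")
print(round(value, 2), unit)
```

Raising an error for an unrecognized unit, rather than passing the value through unchanged, keeps unconverted values from being compared against converted ones.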

Heuristic rule-based validation

Valx_core.context_validation(expressions)

Usage examples

Valx_CTgov.py demonstrates how to use Valx to extract and structure certain types of numeric lab test comparison statements from clinical trial eligibility criteria text using a single CPU core.

Valx_CTgov_multiCPUcores.py demonstrates how to use Valx to extract and structure certain types of numeric lab test comparison statements from clinical trial eligibility criteria text using multiple CPU cores.
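
For readers unfamiliar with the multi-core pattern, the sketch below shows one common way to spread per-trial work across CPU cores with the standard library's multiprocessing.Pool. It does not reproduce Valx_CTgov_multiCPUcores.py: the worker process_trial, the run helper, and the trial identifiers are hypothetical, and the worker only counts lines containing digits as a stand-in for the real extraction steps.

```python
from multiprocessing import Pool

def process_trial(trial):
    """Hypothetical per-trial worker: count criteria lines that contain a digit."""
    trial_id, criteria_text = trial
    numeric_lines = [
        line for line in criteria_text.splitlines()
        if any(ch.isdigit() for ch in line)
    ]
    return trial_id, len(numeric_lines)

def run(trials, cores=2):
    """Map the worker over all trials using the given number of processes."""
    with Pool(processes=cores) as pool:
        return pool.map(process_trial, trials)

if __name__ == "__main__":
    trials = [
        ("trial-A", "HbA1c > 7%\nNo history of stroke"),
        ("trial-B", "Fasting glucose < 126 mg/dL\nHbA1c <= 10%"),
    ]
    print(run(trials, cores=2))
```

The worker is defined at module level so that it can be pickled and sent to the worker processes, and the entry point is guarded by `if __name__ == "__main__":` so that child processes do not re-run it on import.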

Online Demo

http://columbiaelixr.appspot.com/valx

Versions

V0.9 The stable version with full functionality

V1.0 Added multi-CPU core support, with an easily configurable number of cores

V1.1 Separated rules from code into a CSV file named "rules.csv"

V1.2 Separated the numeric feature list from code into a CSV file named "numeric_features.csv"

Citation

Tianyong Hao, Hongfang Liu, Chunhua Weng. Valx: A system for extracting and structuring numeric lab test comparison statements from text. Methods of Information in Medicine. Vol. 55, Issue 3, pp. 266-275, 2016. Available on PubMed.

Contributors

Tianyong Hao

Chengtao Li (new Web user interface with online pattern editing function)
