
argos_nlp's People

Contributors

esilgard, whiteau

argos_nlp's Issues

add MM classification

create algorithm to assign multiple myeloma risk categories according to the
International Myeloma Working Group (IMWG) Molecular Classification of Multiple Myeloma

http://myeloma.org/ArticlePage.action?articleId=3069

  • Standard
    All others including:
    any trisomy
    t(11;14)(q13;q32)
  • Intermediate
    t(4;14)(p16;q32)
    del 13
    any monosomy
  • High
    del 17p13
    t(14;16)(q32;q23)
    t(14;20)
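
A minimal sketch of how the three risk tiers listed in this issue might be checked against an ISCN karyotype string. The regular expressions and the function name `classify_mm_risk` are illustrative assumptions, not the project's implementation; in particular, "del 17p13" is approximated as any `del(17)(p...)` and a monosomy as a comma-delimited `-N` term.

```python
import re

# Assumed marker patterns, transcribed loosely from the risk list in this issue.
HIGH = [r"del\(17\)\(p", r"t\(14;16\)", r"t\(14;20\)"]
INTERMEDIATE = [
    r"t\(4;14\)",
    r"del\(13\)",
    r"(?:^|,)-(?:\d{1,2}|[XY])(?=[,\[])",  # any monosomy, e.g. ",-13,"
]


def classify_mm_risk(karyotype):
    """Return 'High', 'Intermediate', or 'Standard' for one ISCN string."""
    text = karyotype.replace(" ", "")
    # Check the highest tier first so that it wins when several markers co-occur.
    if any(re.search(p, text) for p in HIGH):
        return "High"
    if any(re.search(p, text) for p in INTERMEDIATE):
        return "Intermediate"
    # "All others", including trisomies and t(11;14).
    return "Standard"
```

Checking the tiers from highest to lowest is a design assumption: a karyotype carrying both t(4;14) and del(17p) is reported as High.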

oncoplex tests

create algorithm to extract oncoplex and other mutational analysis test results from pathology reports

specimen size

algorithm for extraction of specimen size(s) from pathology reports (oneFieldPerSpecimen)
** this may break out into disease specific algorithms **

prognostic risk classification based on cytogenetics

write in new risk classification algorithms for other disease groups (like MDS) based on karyotypes parsed from genetics reports

NOTE
this also may mean breaking out the AML SWOG classification from classify_heme_category.py (depending on how different the salient variations are from other risk stratifications)

fix multiple line offset bug

fix offsets when a karyotype spans multiple lines
example:
ISCN Diagnosis: 44,XX,t(4;16)(q21;q22),del(5)(q13q33),del(7)(q11.2),-12,add(17)(p11.2), -20[18]/46,XX[2]

currently the engine highlights the next line down instead:
Summary: POSITIVE for translocation 4;16, deletion 5q, 7q, and 17p, and loss of 12 and 20
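
One possible approach, sketched under assumptions: match the karyotype against the whole report text rather than line by line, so that the reported offsets cover a wrapped karyotype. The label `ISCN Diagnosis:` comes from the example above; the pattern itself and the name `find_karyotype_span` are hypothetical.

```python
import re

# Assumed shape: the label, then everything up to the last "[n]" cell count
# that is not followed by "/" (which would introduce another clone).
# re.DOTALL lets "." cross line breaks inside a wrapped karyotype.
KARYOTYPE = re.compile(r"ISCN Diagnosis:\s*(\S.*?\[\d+\])(?!\s*/)", re.DOTALL)


def find_karyotype_span(report_text):
    """Return (start, end) offsets of the karyotype in the full report, or None."""
    match = KARYOTYPE.search(report_text)
    if match is None:
        return None
    return match.span(1)
```

Because the offsets come straight from the match on the full text, `report_text[start:end]` can be asserted against the expected karyotype in a regression test.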

fix 'normal' cytogenetics offset bug

fix the character offsets in the 'normal' field (the count of regular XX or XY clones)
(seems to be off by -2)

example:
46,XY,der(21)t(11;21)(q13;q22)[2]/46,XY[18]
currently highlights the '6,' before 'XY'
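
A hedged sketch of locating the normal clone and its offsets directly from a regex match, which avoids hand-computed offset arithmetic. It assumes a "normal" clone is exactly `46,XX` or `46,XY` followed by a cell count, appearing at the start of the string or right after a `/`; the name `normal_clone` is hypothetical.

```python
import re

# (?<![^/]) means: preceded by "/" or by the start of the string.
NORMAL_CLONE = re.compile(r"(?<![^/])46,X[XY]\[(\d+)\]")


def normal_clone(karyotype):
    """Return the cell count and character offsets of the normal clone, or None."""
    match = NORMAL_CLONE.search(karyotype)
    if match is None:
        return None
    return {"cells": int(match.group(1)), "start": match.start(), "end": match.end()}
```

The abnormal clone in the example also begins with `46,XY`, but it is followed by `,` rather than `[`, so it is not mistaken for the normal clone.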

re.search to re.match move

change OneFieldPerSpecimen to use re.match instead of re.search, driving it off text files of regexes instead of plain strings, for a smaller set of slightly more complex patterns
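
A small sketch of what this change might look like: one regex per line in a text file, compiled once, then applied with `re.match` so that a pattern must match at the start of the text. The file contents, the `#` comment convention, and the helper names are all hypothetical.

```python
import io
import re


def load_patterns(handle):
    """Compile one regex per non-empty line, skipping lines that start with '#'."""
    return [
        re.compile(line.strip(), re.IGNORECASE)
        for line in handle
        if line.strip() and not line.startswith("#")
    ]


def first_match(patterns, text):
    """Return the first match anchored at the start of text, or None."""
    for pattern in patterns:
        match = pattern.match(text)  # unlike search, match only tries position 0
        if match:
            return match
    return None


# Stand-in for a pattern file on disk (hypothetical contents).
pattern_file = io.StringIO(
    "# hypothetical specimen size pattern\n" r"size:\s*(\d+(?:\.\d+)?)\s*cm" "\n"
)
patterns = load_patterns(pattern_file)
```

With `re.match`, "Tumor size: 2.5 cm" no longer matches this pattern, so any leading context has to be written into the regex explicitly.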

cytogenetics for "all" disease group

write in the ability for the cytogenetics branch to process reports for the "all" (or unknown) disease group parameter
this means either:

  • writing in a disease group classifier for cytology (could be sparse input)
  • and/or writing in some kind of general output for karyotype processing (some small subset of salient mutations or types of abnormalities)

automated output value validation

create hook script to automatically sync/check dropdown field values from metadata.json against data_dictionary options for each algorithm/field
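
The layouts of metadata.json and the data dictionary are not described in this issue, so the following is only a sketch of the comparison step, assuming both have already been loaded into mappings from field name to allowed values. The function name and the input shape are assumptions.

```python
def find_invalid_options(metadata_options, dictionary_options):
    """Return {field: [values]} for dropdown values missing from the data dictionary.

    Both arguments are assumed to map a field name to an iterable of option strings.
    """
    problems = {}
    for field_name, values in metadata_options.items():
        allowed = set(dictionary_options.get(field_name, ()))
        unexpected = sorted(set(values) - allowed)
        if unexpected:
            problems[field_name] = unexpected
    return problems
```

A hook script could exit with a non-zero status whenever the returned mapping is non-empty.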

add MDS classification

add algorithm to classify prognosis risk based on the "5-group cytogenetic classification of MDS"
(used in IPSS http://www.mds-foundation.org/ipss-r-calculator/)

Prognosis     | Single                               | Double                | Complex
Very Good     | -Y; del(11q)                         |                       |
Good          | normal; del(5q); del(20q); del(12p)  | including del(5q)     |
Intermediate  | del(7q); +8; i(17q); +19; any other  | any                   |
Poor          | -7; inv(3)/t(3q)/del(3q)             | including -7/del(7q)  | any 3
Very Poor     |                                      |                       | > any 3
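
A minimal sketch of the table as a lookup, assuming the karyotype has already been reduced to a list of normalized abnormality labels such as `"del(5q)"` or `"-7"`. That input format, the function name, and the tie-break noted in the comments are assumptions.

```python
# Group membership transcribed from the table in this issue.
VERY_GOOD_SINGLE = {"-Y", "del(11q)"}
GOOD_SINGLE = {"del(5q)", "del(20q)", "del(12p)"}
POOR_SINGLE = {"-7", "inv(3)", "t(3q)", "del(3q)"}
CHR7 = {"-7", "del(7q)"}


def classify_mds_cytogenetics(abnormalities):
    """Map the abnormalities of one karyotype to a prognostic group."""
    found = set(abnormalities)
    count = len(found)
    if count == 0:
        return "Good"  # normal karyotype
    if count > 3:
        return "Very Poor"
    if count == 3:
        return "Poor"
    if count == 2:
        # Assumption: a chromosome 7 abnormality outranks del(5q) in a double.
        if found & CHR7:
            return "Poor"
        if "del(5q)" in found:
            return "Good"
        return "Intermediate"
    (single,) = found
    if single in VERY_GOOD_SINGLE:
        return "Very Good"
    if single in GOOD_SINGLE:
        return "Good"
    if single in POOR_SINGLE:
        return "Poor"
    return "Intermediate"  # del(7q), +8, i(17q), +19, or any other single
```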

cytogenetics duplicate cell count bug

fix bug in cytogenetics parsing that counts similar types of mutations twice
example:
49,XY,der(3)t(3;16)(q12;q22),+8,der(16)inv(16)(p13.1q22)t(3;16)(q12;q22),+21,+22[20]

currently the field '3q' shows a cell count of 40
(adding the cell counts for both the derivative and the translocation of (3;16))
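
One way the double counting might be avoided, sketched under assumptions: collect the affected regions of each clone into a set before adding that clone's cell count, so a region mentioned several times within one clone is counted once. The input shape (a list of `(regions, cell_count)` pairs) and the function name are hypothetical.

```python
from collections import defaultdict


def count_cells_by_region(clones):
    """Sum cell counts per affected region, counting each clone at most once per region.

    clones: iterable of (regions, cell_count) pairs, one pair per clone.
    """
    totals = defaultdict(int)
    for regions, cell_count in clones:
        for region in set(regions):  # de-duplicate within a single clone
            totals[region] += cell_count
    return dict(totals)
```

For the example above, a single clone of 20 cells that mentions 3q in both `der(3)t(3;16)` and `der(16)...t(3;16)` would contribute 20, not 40, to the 3q total.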

tubule formation

pull in algorithm for tubule formation in breast cancer pathology reports (based off oneFieldPerReportML)

clinical branch

bring clinical pipeline from dev into master repo (currently houses prognostic staging algorithm)

fix string cleaning offset bug

fix offset bug produced by stripping off the leading '//' from the karyotype string
(off by -2)
example:
//46,XY[7]

currently highlights '//46,XY[' as the karyotype string
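
A sketch of stripping the prefix while carrying the offset along, so the highlighted span still points at the cleaned karyotype in the original report. The function name and the `start` argument (the position of the raw string in the report) are assumptions.

```python
def strip_prefix_with_offset(raw, start, prefix="//"):
    """Strip a leading prefix and return (cleaned, start, end) in report coordinates."""
    removed = len(prefix) if raw.startswith(prefix) else 0
    cleaned = raw[removed:]
    new_start = start + removed  # shift right by the number of characters dropped
    return cleaned, new_start, new_start + len(cleaned)
```

Adjusting the start by the number of removed characters is what keeps the span from drifting two characters to the left.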

mitotic count

pull in algorithm for mitotic count in breast cancer pathology reports (based off oneFieldPerReportML)

gleason grade

create algorithm to extract gleason grade from prostate cancer pathology reports (oneFieldPerReport)

nottingham grade

pull in algorithm to classify nottingham grade in breast cancer reports (based off oneFieldPerReportML)

lymph involvement

algorithm for extraction of lymph node involvement from pathology reports (oneFieldPerSpecimen)

ML disease group classification

swap out keyword voting algorithm for a machine-learning-based classification of pathology reports by disease group (based off the CSS dataset)

karyotype data structure edit

change cytogenetics data model so that references to former cell lines (sl, sdl2, etc.) will also collect the character offsets of the previous abnormalities. This means a change in data structure so that each abnormality is assigned character offsets (as opposed to only each clone/cell line).
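
A hypothetical shape for such a data model, in which each abnormality carries its own offsets and a clone that refers back to an earlier cell line can resolve the inherited abnormalities together with their original positions. All class and attribute names here are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Abnormality:
    text: str
    start: int  # character offset in the report
    end: int


@dataclass
class Clone:
    cell_count: int
    abnormalities: List[Abnormality] = field(default_factory=list)
    inherits_from: Optional["Clone"] = None  # e.g. the clone that "sl" refers to

    def all_abnormalities(self):
        """Inherited abnormalities (with their original offsets) followed by this clone's own."""
        inherited = self.inherits_from.all_abnormalities() if self.inherits_from else []
        return inherited + self.abnormalities
```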

fix datetime creation

move the datetime creation out of OneFieldPerReport into a separate module or class for use by other modules (and hopefully other report branches), and extend it to include other potential date string input formats and datetime output formats
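
A sketch of what a standalone date helper might look like, trying a list of input formats in order and emitting a configurable output format. The format lists and function names are assumptions; the formats actually found in the reports are not stated in this issue.

```python
from datetime import datetime

# Hypothetical input formats, tried in order.
INPUT_FORMATS = ("%m/%d/%Y", "%Y-%m-%d", "%Y%m%d")


def parse_date(text, input_formats=INPUT_FORMATS):
    """Return a datetime for the first format that fits, or None."""
    for fmt in input_formats:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None


def format_date(text, output_format="%Y-%m-%d"):
    """Parse a date string and re-emit it in the requested output format."""
    parsed = parse_date(text)
    return parsed.strftime(output_format) if parsed else None
```

Returning `None` for an unrecognized string, rather than raising, is a design assumption that leaves the caller to decide how to report unparseable dates.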

PSA algorithm

create algorithm to extract PSA from prostate cancer pathology reports (oneFieldPerReport)

smoking status

algorithm for smoking status classification from clinic notes (oneFieldPerPatient)

Using the NLP Engine for Cytogenetics

Similar to pathology, are there example disease groups for Cytogenetics? Like how pathology has 'brain' and 'head_neck'.

On inspecting process.py there is a comment saying "disease specific cytogenetics data dictionary (no general algorithms currently)", but when I look at data_dictionary.txt I can't see any disease specific data.

Apologies in advance, I'm new to the area of health and don't understand most medical terms so I may be missing something basic. Could you please provide a sample usage with sample disease groups for Cytogenetics?

eTOH

algorithm for drinking status classification from clinic notes (oneFieldPerPatient)
