clinicalml / embeddings

Code for AMIA CRI 2016 paper "Learning Low-Dimensional Representations of Medical Concepts"

Home Page: http://cs.nyu.edu/~dsontag/papers/ChoiChiuSontag_AMIA_CRI16.pdf

License: MIT License

Jupyter Notebook 7.40% Python 92.60%

embeddings's Issues

Additional Information on Claims Dataset

For explainability purposes, some recent papers have shown that stacking embeddings learned from claims/EHRs of different care settings can be valuable. Intuitively, this also makes sense, as the distribution and co-occurrence of diagnosis codes differ between the hospital and the clinic. As such, can you provide additional information on the claims dataset used to learn these embeddings, including the following:

Were these claims for commercial insurance or from public insurers (e.g., Medicare/Medicaid)?
Were these claims from a mix of care settings, inpatient only, or outpatient only?

CUIs in stanford_cuis_svd_300.txt

The first column in stanford_cuis_svd_300.txt contains numbers (e.g. 4411984) that don't seem to be CUIs (CUIs typically begin with the character 'C'). Is there a way to map these numbers to CUIs?

The first column in DeVine_etal_200.txt does look like it contains legitimate CUIs (e.g. C0030705).
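The difference between the two files can be checked mechanically. Below is a minimal sketch, assuming each row is whitespace-separated with the identifier in the first column and that a CUI is the letter 'C' followed by seven digits (which matches the C0030705 example). The function name and the sample rows are hypothetical and are not taken from the repository.

```python
import re
from collections import Counter

# Assumption: a CUI is 'C' followed by exactly seven digits (e.g. C0030705).
CUI_PATTERN = re.compile(r"^C\d{7}$")


def summarize_first_column(lines):
    """Count first-column tokens that look like CUIs versus everything else."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        key = "cui_like" if CUI_PATTERN.match(parts[0]) else "other"
        counts[key] += 1
    return counts


# Hypothetical sample rows; for a real file, pass an open file handle instead.
sample = ["C0030705 0.12 -0.03", "4411984 0.40 0.07", ""]
summary = summarize_first_column(sample)
print(dict(summary))
```

A file whose rows fall almost entirely under "other" would need a separate mapping table to recover CUIs; this check only reports the mismatch and does not resolve it.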

KeyError in Embedding_Evaluation.ipynb [3]

KeyError: 'C0002962'.
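A KeyError like this usually means the requested concept is absent from the loaded embedding vocabulary. The sketch below is one possible way to surface such gaps before evaluation, assuming the embeddings are held in a dict-like mapping from concept ID to vector. It is not the notebook's actual code, and the helper name and toy vectors are hypothetical.

```python
def split_known_missing(concepts, embeddings):
    """Separate concept IDs that have a vector from those that do not."""
    known = [c for c in concepts if c in embeddings]
    missing = [c for c in concepts if c not in embeddings]
    return known, missing


# Hypothetical toy vocabulary; real vectors would come from the embedding file.
embeddings = {"C0030705": [0.1, 0.2]}
known, missing = split_known_missing(["C0030705", "C0002962"], embeddings)
print("missing from vocabulary:", missing)
```

Reporting the missing IDs up front makes it easier to tell whether the evaluation set and the embedding file use different vocabularies.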

about claims_codes_hs_300.txt.gz

Hi,

I am confused about the codes in the IDX_IPR_C_N_L_month_ALL_MEMBERS_fold1_s300_w20_ss5_hs_thr12.txt file inside claims_codes_hs_300.txt.gz. Do they refer to ICD-9 codes? When I tried to find a particular ICD-9 code in there, it didn't match.

Thanks.
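One way to investigate a mismatch like this is to look at how the tokens in the file are actually written. The sketch below assumes, without confirmation from the repository, that each row starts with a token followed by its vector and that tokens may carry an underscore-delimited prefix naming the code system. The prefixes in the sample are placeholders, not values from the real file.

```python
from collections import Counter


def prefix_counts(lines):
    """Tally the text before the first underscore of each row's token."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank lines and a possible "count dim" header row
        token = parts[0]
        prefix = token.split("_", 1)[0] if "_" in token else "<no prefix>"
        counts[prefix] += 1
    return counts


# Hypothetical rows with placeholder prefixes (AAA, BBB); not real file contents.
sample = ["3 2", "AAA_123 0.1 0.2", "BBB_9 0.3 0.4", "AAA_7 0.0 0.1"]
counts = prefix_counts(sample)
print(counts.most_common())
```

If the tally shows prefixed tokens, a bare ICD-9 code would not match unless the same prefix and formatting (for example, with or without the decimal point) are used in the lookup.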
