svjan5 / medtype

MedType: Improving Medical Entity Linking with Semantic Type Prediction

License: Apache License 2.0

Shell 17.82% Python 78.24% JavaScript 1.20% HTML 1.46% CSS 0.21% Dockerfile 1.07%

bert-as-service biomedical deep-learning entity-linking medical pytorch state-of-the-art

medtype's Issues

UMLS authentication change

Hi,

One of your linkers (cTAKES) has changed the way it connects to UMLS. I thought you would like to know so you can update your README.

More info here.

medtype-as-service

Hi,
I am running medtype-as-service. When I start the server with the following command:

medtype-serving-start --model_path $PWD/resources/pretrained_models/pubmed_model.bin \
		      --type_remap_json $PWD/../config/type_remap.json \
		      --type2id_json $PWD/../config/type2id.json \
		      --umls2type_file $PWD/resources/umls2type.pkl \
		      --entity_linker scispacy

I am getting the following error:

I:VENTILATOR:[__i:__i: 64]:freeze, optimize and export graph, could take a while...
Traceback (most recent call last):
  File "/opt/conda/bin/medtype-serving-start", line 33, in <module>
    sys.exit(load_entry_point('medtype-serving-server==1.0.0', 'console_scripts', 'medtype-serving-start')())
  File "/opt/conda/lib/python3.6/site-packages/medtype_serving_server-1.0.0-py3.6.egg/medtype_serving/server/cli/__init__.py", line 4, in main
    with MedTypeServer(get_run_args()) as server:
  File "/opt/conda/lib/python3.6/site-packages/medtype_serving_server-1.0.0-py3.6.egg/medtype_serving/server/__init__.py", line 66, in __init__
    self.model_params   = self.load_model(args.model_path)
  File "/opt/conda/lib/python3.6/site-packages/medtype_serving_server-1.0.0-py3.6.egg/medtype_serving/server/__init__.py", line 74, in load_model
    state               = torch.load(model_path, map_location="cpu")
  File "/opt/conda/lib/python3.6/site-packages/torch/serialization.py", line 527, in load
    with _open_zipfile_reader(f) as opened_zipfile:
  File "/opt/conda/lib/python3.6/site-packages/torch/serialization.py", line 224, in __init__
    super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: version_ <= kMaxSupportedFileFormatVersion INTERNAL ASSERT FAILED at /pytorch/caffe2/serialize/inline_container.cc:132, please report a bug to PyTorch. Attempted to read a PyTorch file with version 3, but the maximum supported version for reading is 2. Your PyTorch installation may be too old. (init at /pytorch/caffe2/serialize/inline_container.cc:132)

I tried loading the pretrained model with torch 1.8.0 using this code:
model = torch.load('pubmed_model.bin', map_location="cpu")
It loaded successfully.
But when I did the same with torch 1.4.0, I got the same error as above.

Was the pretrained model in the medtype GitHub repo saved with a different PyTorch version, or is there some other problem? Please let me know how to solve this.

Thanks
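
The error text itself points to a serialization-format mismatch: the checkpoint carries format version 3, while torch 1.4.0 reads at most version 2. A minimal sketch for confirming this without loading the model is below. It assumes the zip-based checkpoint layout, in which the archive holds a small `version` record; `checkpoint_format` and `resave_legacy` are hypothetical helper names, not part of MedType or PyTorch. `resave_legacy` has to be run under a PyTorch release that can already read the file (1.8.0 worked above), and the re-saved file may then load under the older release.

```python
import zipfile


def checkpoint_format(path):
    """Return (kind, version): kind is 'zip' or 'legacy', version an int or None."""
    if not zipfile.is_zipfile(path):
        return ("legacy", None)
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            # Assumption: the format version is stored in a record named
            # "<prefix>/version" inside the archive.
            if name == "version" or name.endswith("/version"):
                text = archive.read(name).decode("ascii").strip()
                return ("zip", int(text))
    return ("zip", None)


def resave_legacy(src, dst):
    """Re-save a checkpoint in the older, non-zip format (needs PyTorch >= 1.6)."""
    import torch  # imported lazily so checkpoint_format works without PyTorch

    state = torch.load(src, map_location="cpu")
    torch.save(state, dst, _use_new_zipfile_serialization=False)
```

Calling `checkpoint_format("pubmed_model.bin")` on the downloaded file would show whether it is a zip archive and, if the record is present, which format version it declares.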

Semantic type prediction results in paper

In Table 5, you provide results for MT <- WikiMed and MT <- PubMedDS.
I am curious about the semantic type prediction results of MedType trained on WikiMed (or PubMedDS) without fine-tuning.

Python package instead of medtype-as-service

I think it'd be really helpful to pull out the entity linkers that are Python libraries (all except cTAKES) and make a Python package that can be used in notebooks and installed easily in conda/pip environments.

I've made some changes and gotten the scispacy + medtype part to work without the server (if that's helpful), however I haven't checked the other linkers or done any extensive testing. Forked repo: https://github.com/vsocrates/medtype

Update entity_linkers.py

Hi,

I had problems running the scispacy linker v0.2. In particular, there was no config.cfg file to read. I solved that by installing v0.4.

However, in the new release there are a few changes with respect to nlp.add_pipe.
I had to manually change your source code file entity_linkers.py to make it run:
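
The modified code is not reproduced in the issue. As a rough sketch of the kind of change involved: spaCy 3 (which scispacy v0.4 builds on) expects `nlp.add_pipe` to receive the registered name of a component plus a config dict, whereas spaCy 2 accepted an already constructed component object. The helper name `add_umls_linker` and its `spacy_major` and `legacy_linker` arguments are hypothetical, and the actual lines in entity_linkers.py may differ.

```python
def add_umls_linker(nlp, spacy_major, legacy_linker=None):
    """Attach a UMLS entity linker to `nlp` in the style the spaCy version expects."""
    if spacy_major >= 3:
        # spaCy 3 style: factory name plus config. The caller must already have
        # imported scispacy.linking so that "scispacy_linker" is registered.
        config = {"resolve_abbreviations": True, "linker_name": "umls"}
        return nlp.add_pipe("scispacy_linker", config=config)
    if legacy_linker is None:
        raise ValueError("spaCy 2 needs an already constructed linker object")
    # spaCy 2 style: pass the component object itself.
    return nlp.add_pipe(legacy_linker)
```

With scispacy v0.4 installed, a call would look like `add_umls_linker(nlp, int(spacy.__version__.split(".")[0]))` after `import scispacy.linking`.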

pip version?

I'm using pip-21.1.1 to install the server requirements as follows:
pip install -r ./medtype-as-service/server/requirements.txt
Output:

Looking in indexes: https://pypi.org/simple, https://:****@pkgs.dev.azure.com/DevOps-RD/daa07a13-a918-496b-9b75-929313115fba/_packaging/az-artifacts-pypi/pypi/simple/
ERROR: Could not find a version that satisfies the requirement torch==1.4.0 (from versions: 0.1.2, 0.1.2.post1, 0.1.2.post2, 1.7.1, 1.8.0, 1.8.1)
ERROR: No matching distribution found for torch==1.4.0

It seems that the requirements file cannot be satisfied with pip-21.1.1. Could you let me know which version of pip you are using, or how I can solve this problem?
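
The pip version is probably not the cause. The "from versions" list in the error shows which torch releases pip found for the current interpreter and platform, and 1.4.0 is not among them. A small sketch for pulling that list out of the message follows; `parse_pip_no_match` is a hypothetical helper, and the regular expression assumes the message layout shown above.

```python
import re

# Matches e.g. "... the requirement torch==1.4.0 (from versions: 0.1.2, 1.7.1)"
_NO_MATCH = re.compile(
    r"requirement (?P<name>[^=\s]+)==(?P<pin>\S+) "
    r"\(from versions: (?P<found>[^)]*)\)"
)


def parse_pip_no_match(message):
    """Return (package, pinned_version, available_versions) or None."""
    match = _NO_MATCH.search(message)
    if match is None:
        return None
    found = [v.strip() for v in match.group("found").split(",") if v.strip()]
    if found == ["none"]:
        # pip prints "none" when no candidate version is visible at all
        found = []
    return match.group("name"), match.group("pin"), found
```

If the pinned version is missing from the returned list, the options are to use an interpreter for which that release was published or to relax the pin to one of the listed versions.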

pretrained model URLs - `general text` and `Electronic Health Records (EHR)` same file?

@svjan5
Thanks for sharing.
Just wondering:

Are the pretrained model URLs for general text and Electronic Health Records linked to the same file? They are different URLs, but the downloaded files have the same size and the same file name (general_model.zip, 1185789292 bytes).
The online demo does not seem to work for me (Linux 64, Chrome v.87 and Firefox v.78). Which browsers do you know to work?
Thanks
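
Equal size and file name suggest, but do not prove, that the two downloads are identical. One way to check is to compare content hashes, sketched here with the standard library only; `sha256_of` and `same_content` are hypothetical helper names, and the paths passed in are whatever the two downloads were saved as.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so a 1 GB archive is never held in memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files have identical SHA-256 digests."""
    return sha256_of(path_a) == sha256_of(path_b)
```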

Links to Pre-trained Models Not Working

The links to the pre-trained models do not work. It seems like they have been removed. Is there another place to download the models? Thanks. @svjan5

MedType Demo - not working

When I go to your demo website, a popup asks me to accept the certificate. When I click OK, it takes me to https://128.2.204.127:8124/run_linker, but that page fails with 'This site can't be reached'. I have tried both Chrome and Edge. I am not behind a proxy server, and I am trying from my home PC.

Semantic type/category information in PubMedDS and WikiMedDS datasets

Hi there!

I downloaded the datasets included with MedType by running download_datasets.sh. I noticed that some datasets (ncbi.json, medmentions.json) include category information, while others (wikimed.json, pubmed_ds) don't. I couldn't find any documentation explaining why category information is not included -- I notice that Figure 4 of the MedType paper specifically mentions the Semantic Type of the term. Is that information available in these datasets somewhere? If so, could you please document in this repository how to access it?

Thanks so much!
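
For anyone wanting to compare the files directly, a quick inventory of top-level field names per dataset can show which files carry category information. The sketch below makes no claim about the actual layout of the MedType datasets: it assumes each file is either a single JSON document or one JSON object per line, and `load_records` and `field_counts` are hypothetical helper names.

```python
import json
from collections import Counter


def load_records(path):
    """Read a JSON array, a single JSON object, or JSON Lines into a list."""
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Fall back to one JSON object per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    return data if isinstance(data, list) else [data]


def field_counts(records):
    """Count how many records contain each top-level key."""
    counts = Counter()
    for record in records:
        if isinstance(record, dict):
            counts.update(record.keys())
    return counts
```

Running `field_counts(load_records(path))` on each downloaded file and comparing the resulting keys would show where a category-like field is present and where it is absent.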