Code Monkey home page Code Monkey logo

ewiser's Introduction

EWISER (Enhanced WSD Integrating Synset Embeddings and Relations)

This repo hosts the code necessary to reproduce the results of our ACL 2020 paper, Breaking Through the 80% Glass Ceiling: Raising the State of the Art in Word Sense Disambiguation by Incorporating Knowledge Graph Information, by Michele Bevilacqua and Roberto Navigli, which you can read on ACL Anthology.

You will also find a simple spacy plugin that makes it easy to use EWISER in your own project!

EWISER relies on the fairseq library.

NEWS:

Check out the Multilingual section below!

How to Cite

@inproceedings{bevilacqua-navigli-2020-breaking,
    title = "Breaking Through the 80{\%} Glass Ceiling: {R}aising the State of the Art in Word Sense Disambiguation by Incorporating Knowledge Graph Information",
    author = "Bevilacqua, Michele  and Navigli, Roberto",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.255",
    pages = "2854--2864",
    abstract = "Neural architectures are the current state of the art in Word Sense Disambiguation (WSD). However, they make limited use of the vast amount of relational information encoded in Lexical Knowledge Bases (LKB). We present Enhanced WSD Integrating Synset Embeddings and Relations (EWISER), a neural supervised architecture that is able to tap into this wealth of knowledge by embedding information from the LKB graph within the neural architecture, and to exploit pretrained synset embeddings, enabling the network to predict synsets that are not in the training set. As a result, we set a new state of the art on almost all the evaluation settings considered, also breaking through, for the first time, the 80{\%} ceiling on the concatenation of all the standard all-words English WSD evaluation benchmarks. On multilingual all-words WSD, we report state-of-the-art results by training on nothing but English.",
}

Installation

It is recommended to create a fresh conda env to use ewiser (e.g. conda create -n ewiser python=3.7 pip; conda activate ewiser).

You'll also need pytorch 1.6, and torch_sparse. Assuming you use CUDA 10.1:

conda install pytorch=1.6.0 torchvision cudatoolkit=10.1 -c pytorch
pip install torch-scatter torch-sparse -f https://pytorch-geometric.com/whl/torch-1.6.0+cu101.html

Clone this repo, install the other dependencies, and then ewiser as well.

git clone https://github.com/SapienzaNLP/ewiser.git
cd ewiser
pip install -r requirements.txt
pip install -e .

Now you are ready to start!

Externally downloadable resources

EWISER English checkpoints:

EWISER multilingual checkpoints:

Datasets:

  • WSD Evaluation Framework: contains the SemCor training corpus, along with the evaluation datasets from Senseval and SemEval.
  • Multilingual Evaluation Datasets: the repo contains the French, German, Italian and Spanish datasets from SemEval 2013 and 2015.
  • The other datasets used are in res/corpora/*/orig.

Pre-preprocessed SensEmBERT + LMMS embeddings (needed to train your own EWISER model):

Multilinguality

EWISER supports all the languages for which you are able to create a mapping starting from BabelNet indices 4.0.1.

French/German/Italian/Spanish

  1. Download the BabelNet indices (ver. 4.0.1);
  2. cd multilinguality;
  3. Set your BabelNet indices path in multilinguality/config/babelnet.var.properties;
  4. bash enable.sh. The mapping is limited to the Princeton WordNet subgraph (so you need to use the wn split if you plan to evaluate on mwsd-datasets).

Other languages

Please download the multilingual mapper from Google Drive and find the instructions contained there.

Evaluate

Evaluation is run using bin/eval_wsd.py:

# Download the WSD framework
# wget -c http://lcl.uniroma1.it/wsdeval/data/WSD_Evaluation_Framework.zip -P res
# unzip
# WSD_FRAMEWORK=res/WSD_Evaluation_Framework

python bin/eval_wsd --checkpoints <your_checkpoint.pt> --xmls ${WSD_FRAMEWORK}/Evaluation_Datasets/ALL/ALL.data.xml ${WSD_FRAMEWORK}/Evaluation_Datasets/semeval2007/semeval2007.data.xml

Spacy plugin

EWISER can be used as a spacy plugin. Please check bin/annotate.py.

Train Your Model

Experiment Folder and Preprocessing

To train a model from scratch, you need to set up an experiment folder containing:

  • the dict.txt file from res/dictionaries/dict.txt
  • the preprocessed training corpora with name train, train1, train2 etc.
  • the preprocessed validation dataset with name valid.

We have included our experiment directories in res/experiments/.

Should you need to preprocess your own corpus, you can use bin/preprocess_wsd.py (check out python bin/preprocess_wsd.py --help)!

Training

To launch a training run, execute:

cd bin
bash bin/train-ewiser.sh

This will train EWISER on SemCor + tagged glosses + WordNet Examples. It assumes you have downloaded the LMMS+SensEmBERT embeddings and put them in res/embeddings/.

You can modify hyperparameters or change the training corpora by modifyng train-ewiser.py. Arguments are documented in ewiser/fairseq_ext/models/sequence_tagging.py.

Training Resources

Sense Embeddings

If you want to use your own sense embeddings in EWISER, you have to preprocess them as follows:

python bin/get_centroids.py ${EMBEDDINGS} ${EMBEDDINGS}.centroids.txt bin/sensekeys2offsets.txt
python bin/reduce_dims.py ${EMBEDDINGS}.centroids.txt ${EMBEDDINGS}.centroids.svd512.txt -d 512

The sense embeddings will have to be in Glove .txt format, without a header row, and with a WN 3.0 sensekey as identifiers.

Edges

The adjacency matrix A in EWISER is stored as an edgelist. Each line is an edge, with three \t-separated values. Check res/edges/ for examples.

License

This project is released under the CC-BY-NC-SA 4.0 license (see LICENSE.txt). If you use EWISER, please put a link to this repo.

Acknowledgements

The authors gratefully acknowledge the support of the ERC Consolidator Grant MOUSSE No. 726487 under the European Union's Horizon 2020 research and innovation programme.

This work was supported in part by the MIUR under the grant "Dipartimenti di eccellenza 2018-2022" of the Department of Computer Science of the Sapienza University of Rome.

ewiser's People

Contributors

cgravier avatar mbevila avatar navigli avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar

ewiser's Issues

Babelnet 4.0.1 Dependency

Hello,

It seems that disambiguation for languages other than English requires Babelnet 4.0.1, which is not available anymore. We tried just replacing the BabelNet 4 references in the bash scripts with those to Babelnet 5.0, but this does not work and producing the offset/lemmapos dictionary files from a newer version seems likely to introduce errors from altered offsets.
Download links for the old API from other locations are all dead.

Do you have plans to update this to BabelNet 5.0, or to make 4.0.1 available for this project?
How else would you suggest using the pretrained Multilingual model for other languages?

Thanks

Installation issue with torch-scatter

Hello,

I am trying the exact same commands specified in the README but I run into an issue in the step
pip install torch-scatter torch-sparse -f https://pytorch-geometric.com/whl/torch-1.6.0+cu102.html

I am getting the following error:
/anaconda2/envs/ewiser/lib/python3.7/site-packages/torch/include/ATen/core/interned_strings.h:415:1: note: in expansion of macro ‘FORALL_NS_SYMBOLS’
FORALL_NS_SYMBOLS(DEFINE_SYMBOL)
^~~~~~~~~~~~~~~~~
error: command 'gcc' failed with exit status 1
----------------------------------------
ERROR: Command errored out with exit status 1: anaconda2/envs/ewiser/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-kouq5f1x/torch-sparse_cd2910acb4354bf6868ba6699efa7d35/setup.py'"'"'; file='"'"'/tmp/pip-install-kouq5f1x/torch-sparse_cd2910acb4354bf6868ba6699efa7d35/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f

I looked online and the solution mentioned was using python binaries (the same command as above) but that doesn't seem to be working, could you please help?

Spacy plugin broken by change in Spacy 3.0

Hi,

When trying to use the spacy plugin, I get:

python disambiguate.py /home/ubuntu/src/ewiser/ewiser.semcor_base.pt
Traceback (most recent call last):
  File "disambiguate.py", line 326, in <module>
    nlp.add_pipe(wsd, last=True)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/spacy/language.py", line 749, in add_pipe
    raise ValueError(err)
ValueError: [E966] `nlp.add_pipe` now takes the string name of the registered component factory, not a callable component. Expected string, but got Disambiguator(
  (model): LinearTaggerModel(
    (embedder): BERTEmbedder(
      (bert_model): BertModel(
        (embeddings): BertEmbeddings(
          (word_embeddings): Embedding(28996, 1024, padding_idx=0)
          (position_embeddings): Embedding(512, 1024)
          (token_type_embeddings): Embedding(2, 1024)
          (LayerNorm): BertLayerNorm()
          (dropout): Dropout(p=0.1, inplace=False)
        )
        (encoder): BertEncoder(
          (layer): ModuleList(
            (0): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (1): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (2): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (3): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (4): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (5): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (6): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (7): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (8): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (9): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (10): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (11): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (12): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (13): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (14): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (15): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (16): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (17): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (18): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (19): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (20): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (21): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (22): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (23): BertLayer(
              (attention): BertAttention(
                (self): BertSelfAttention(
                  (query): Linear(in_features=1024, out_features=1024, bias=True)
                  (key): Linear(in_features=1024, out_features=1024, bias=True)
                  (value): Linear(in_features=1024, out_features=1024, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): BertSelfOutput(
                  (dense): Linear(in_features=1024, out_features=1024, bias=True)
                  (LayerNorm): BertLayerNorm()
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): BertIntermediate(
                (dense): Linear(in_features=1024, out_features=4096, bias=True)
              )
              (output): BertOutput(
                (dense): Linear(in_features=4096, out_features=1024, bias=True)
                (LayerNorm): BertLayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (pooler): BertPooler(
          (dense): Linear(in_features=1024, out_features=1024, bias=True)
          (activation): Tanh()
        )
      )
    )
    (decoder): LinearDecoder(
      (embed_tokens): FakeInput()
      (linears): ModuleList(
        (0): Linear(in_features=1024, out_features=512, bias=True)
      )
      (dropout): Dropout(p=0.2, inplace=False)
      (logits): Linear(in_features=512, out_features=117664, bias=False)
      (structured_logits): StructuredLogits(
        (adjacency_pars): ParameterList(
            (0): Parameter containing: [torch.LongTensor of size 2x174717]
            (1): Parameter containing: [torch.FloatTensor of size 174717]
            (2): Parameter containing: [torch.LongTensor of size 2]
        )
      )
      (norm): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
) (name: 'None').

- If you created your component with `nlp.create_pipe('name')`: remove nlp.create_pipe and call `nlp.add_pipe('name')` instead.

- If you passed in a component like `TextCategorizer()`: call `nlp.add_pipe` with the string name instead, e.g. `nlp.add_pipe('textcat')`.

- If you're using a custom component: Add the decorator `@Language.component` (for function components) or `@Language.factory` (for class components / factories) to your custom component and assign it a name, e.g. `@Language.component('your_name')`. You can then run `nlp.add_pipe('your_name')` to add it to the pipeline.

Which is similar to an issue I've recently seen with another system. Spacy seems to have changed how these custom pipelines work.

Any chance of a fix in the near future? I can try to fix it myself but it would be nice to have a solution from the authors. Presumably, if my diagnosis is correct, you will have a lot of people asking about this soon.

Thanks,
Alan

Installation issue

Hi,

I had some problems with

pip install torch-scatter==latest+${CUDA} torch-sparse==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-1.5.0.html

since "latest" tag seems to be no longer supported cf. this issue.
You can cope with this issue with the following commands in which the torch version is explicitly asserted

pip install torch-sparse -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html

multilingual datasets and model

Hi, nice work there. Could you please detail the multilingual dataset version ('all' or 'wn') and also the multilingual BERT version (base or large)?

Memory Issue while running model as REST service

I'm using a REST service to run WSD model. But it's RAM keeps on increasing as I the number of hits increase. I was testing how much RAM will be enough for this, so at last I used a 64GB server, and it consumed all of it.

This is the code for rest service:-

import requests
import re
import os
import time
import json
from flask import jsonify
from time import sleep
from json import dumps
from flask import Flask, request
from ewiser.spacy.disambiguate import Disambiguator
import spacy

nlp = spacy.load("en_core_web_sm", disable=['parser', 'ner'])
wsd = Disambiguator("/content/ewiser.semcor+wngt.pt", lang="en")
nlp.add_pipe(wsd, last=True)
app = Flask(name)
@app.route("/wsd", methods=['POST'])
def wsd():

print('service ---------------(( + _ + ))----------------- started')
item = {}
x = request.get_json()
print('+++++++++++++++++++++++++++++++++', x)
sent = x['text'] 
doc = nlp(sent)
List = []
for w in doc:
	if w._.offset:
		print(w.text, w.lemma_, w.pos_, w._.offset, w._.synset.definition())
		itm= {}
		itm['lemma']=w.lemma_
		itm['synset']=w._.offset
		itm['gloss']=w._.synset.definition()
		List.append(itm)
print(List) 
return jsonify(List)

if name == "main":
#app.debug = True
app.run(host='localhost', port=7777)

code

Hello! Could you please share the EWISER code soon? I'm reading your paper now and it'd be great if I could reference the code as well. Thank you!

Terminate called after throwing an instance of 'std::bad_alloc'

I followed the installation instructions to the dot - including the torch problem described in the first issue but I get the following error when running regardless of the checkpoint issue

python /home/dan/ewiser/bin/annotate.py -c /home/dan/ewiser/res/downloaded/ewiser.semcor_base.pt test.txt

......
2022-07-13 11:58:22 | INFO | pytorch_pretrained_bert.modeling | Model config {
  "attention_probs_dropout_prob": 0.1,
  "directionality": "bidi",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "max_position_embeddings": 512,
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "type_vocab_size": 2,
  "vocab_size": 28996
}

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

Any ideas?

Loading checkpoints

How does one load the pre-trained checkpoints?

I tried it with embs = torch.load('ewiser.semcor+wngt.pt') but it gives me an error:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-2-123918dcfdcf> in <module>
----> 1 embs = torch.load('ewiser.semcor+wngt.pt', map_location = 'cpu')

~/miniconda3/lib/python3.8/site-packages/torch/serialization.py in load(f, map_location, pickle_module, **pickle_load_args)
    606                     return torch.jit.load(opened_file)
    607                 return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
--> 608         return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
    609
    610

~/miniconda3/lib/python3.8/site-packages/torch/serialization.py in _legacy_load(f, map_location, pickle_module, **pickle_load_args)
    785     unpickler = pickle_module.Unpickler(f, **pickle_load_args)
    786     unpickler.persistent_load = persistent_load
--> 787     result = unpickler.load()
    788
    789     deserialized_storage_keys = pickle_module.load(f, **pickle_load_args)

ModuleNotFoundError: No module named 'qbert'

Spacy plugin

Is the spaCy plugin working? The link in the README directs to spaCy homepage.
I am asking because I wanted to use your code in my project for WSD.

Notebooks?

The README says "EWISER can be used as a spacy plugin. Please check annotate.py and notebooks/inspect_wsd.py". However, there doesn't seem to be any notebooks/ directory. It looks like it is in the .gitignore file. Can you either add that directory to the repo or delete the mention of it from the README?

running issue with fairseq criterion

I got a criterion issue with the class named WeightedCrossEntropyCriterionWithSoftmaxMask, It is implemented in the ewiser\fairseq_ext\criterions, but is not infered. I tried to figure out why it happen.

image

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.