Similarity-based Memory Enhanced Joint Entity and Relation Extraction

Official PyTorch implementation of the paper "Similarity-based Memory Enhanced Joint Entity and Relation Extraction", accepted at the International Conference on Computational Science 2023.

Setup

Install requirements

To set up the environment, first create a Python 3.10 environment using conda or virtualenv and activate it. Then install Poetry using pip and run the setup target:

pip install poetry==1.4.2

make setup

To force the CUDA version of the PyTorch library, run the following:

make setup-cuda

Init git submodules

We built our solution on the wonderful work in the JEREX repository. We also used code from the Edge-oriented Graph repository to process the CDR dataset. Both are included as git submodules, which you can initialize by running the following:

git submodule init && git submodule update --recursive

Because we import some code from JEREX, it might be useful to add the ./submodules/jerex directory to PYTHONPATH.
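
If exporting PYTHONPATH in the shell is inconvenient (for example in a notebook), the same effect can be approximated from Python. This is a minimal sketch, not part of the repository: the helper name add_jerex_to_path is hypothetical, and it assumes the process is started from the repository root.

```python
import sys
from pathlib import Path


def add_jerex_to_path(repo_root: Path) -> Path:
    """Prepend <repo_root>/submodules/jerex to sys.path if it is not there yet."""
    jerex_dir = (repo_root / "submodules" / "jerex").resolve()
    if str(jerex_dir) not in sys.path:
        sys.path.insert(0, str(jerex_dir))
    return jerex_dir


# Assumption: the current working directory is the repository root.
jerex_dir = add_jerex_to_path(Path.cwd())
```

Calling the helper more than once is harmless, since the directory is only added when it is missing from sys.path.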

Set environment variables

Create the '.env' file based on the '.env.example' file and set the following variables:

PRETRAINED_MODELS_DIR - the directory where the pretrained Hugging Face 🤗 models are stored;
WANDB_PROJECT_NAME - the project name used by the logger, if you want to use Weights & Biases logging.
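
To illustrate what such a file might contain, the sketch below parses simple KEY=VALUE lines using only the standard library. It is an assumption-laden illustration: how the repository actually loads '.env' is not stated here, the parse_env helper is hypothetical, and the directory and project name are placeholders rather than values from the project.

```python
import os


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


# Hypothetical file contents; both values are placeholders.
example = """
# .env
PRETRAINED_MODELS_DIR=./storage/pretrained_models
WANDB_PROJECT_NAME=my-project
"""

settings = parse_env(example)
# Export the values without overwriting anything already set in the shell.
for key, value in settings.items():
    os.environ.setdefault(key, value)
```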

Download datasets

You can download the datasets we used in our experiments in a ready-to-use format:

./scripts/datasets/fetch_datasets.sh

Download pretrained models

You can download the pretrained model checkpoints we used in our experiments:

./scripts/fetch_models.sh

Configuration

We used Hydra to create a hierarchical configuration for running our experiments. The ./config directory contains all the .yaml files used to run the scripts.

Example inference

You can use the downloaded model checkpoint to run inference on the CDR dataset with the following command:

python ./scripts/run.py --config-name memory_re/cdr/test

All artifacts created during a run are logged to a directory created in ./storage/runs. The script is configured to visualize predictions in an .html file (using code from JEREX) and to visualize the memory activations for entity and mention categories.
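
To find the generated visualizations after a run, something like the following could be used. It is only a sketch: it assumes each run gets its own subdirectory under ./storage/runs and that the newest run is the one modified most recently. The exact layout inside a run directory is not documented here, so the search simply looks for any .html file. Both helper names are hypothetical.

```python
from pathlib import Path
from typing import Optional


def latest_run_dir(runs_root: Path) -> Optional[Path]:
    """Return the most recently modified subdirectory of runs_root, or None."""
    if not runs_root.is_dir():
        return None
    run_dirs = [p for p in runs_root.iterdir() if p.is_dir()]
    return max(run_dirs, key=lambda p: p.stat().st_mtime, default=None)


def html_reports(run_dir: Path) -> list[Path]:
    """Collect every .html file below run_dir, sorted by path."""
    return sorted(run_dir.rglob("*.html"))


run_dir = latest_run_dir(Path("./storage/runs"))
if run_dir is None:
    print("No runs found under ./storage/runs")
else:
    for report in html_reports(run_dir):
        print(report)
```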

Training

You can run training on the CDR dataset using the following command:

python ./scripts/run.py --config-name memory_re/cdr/train
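
When inference and training are launched repeatedly, a thin wrapper around the two commands above can be handy. The sketch below is not part of the repository. It reuses the two config names from this README and passes any extra arguments through unchanged, which is how Hydra's key=value command-line overrides are normally supplied. The helper names are hypothetical, and no specific override key is assumed to exist in the project's configs.

```python
import subprocess
import sys

# Config names taken from the commands in this README.
CONFIGS = {
    "test": "memory_re/cdr/test",
    "train": "memory_re/cdr/train",
}


def build_command(mode: str, overrides: tuple[str, ...] = ()) -> list[str]:
    """Build the run.py command line for 'test' or 'train'."""
    if mode not in CONFIGS:
        raise ValueError(f"unknown mode {mode!r}, expected one of {sorted(CONFIGS)}")
    return [sys.executable, "./scripts/run.py", "--config-name", CONFIGS[mode], *overrides]


def run(mode: str, overrides: tuple[str, ...] = ()) -> int:
    """Run the command from the repository root and return its exit code."""
    return subprocess.run(build_command(mode, overrides), check=False).returncode
```

For example, run("train") would start training with the current interpreter. It must be called from the repository root so that the relative script path resolves.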
