Relation extraction (RE) is a classification task. Given a sentence and a pair of entities, RE models aim to classify the relationship between them that is entailed by the context. The VictorNLP-RE framework supplies various deep learning relation extraction models. In-depth descriptions of the resources and their implementation details are recorded here:
VictorNLP, a PyTorch-based NLP framework, provides intuitive interfaces and carefully designed code modularity for implementing and pipelining NLP modules. For more details, refer to the VictorNLP Specification page.
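To make the task framing concrete, here is a minimal sketch of RE as classification over a sentence and an entity pair. The `RelationInstance` layout, the label set, and the cue-word rule are all illustrative assumptions; they are not the VictorNLP data format or any model shipped with this framework.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RelationInstance:
    """One RE example (hypothetical layout, not the VictorNLP format)."""
    tokens: List[str]
    head: Tuple[int, int]  # [start, end) token span of the first entity
    tail: Tuple[int, int]  # [start, end) token span of the second entity


# Illustrative label set; a real corpus defines its own relation inventory.
LABELS = ["no_relation", "born_in", "works_for"]


def toy_classify(instance: RelationInstance) -> str:
    """Stand-in for a trained model: choose a label from cue words
    found between the two entity spans."""
    between = instance.tokens[instance.head[1]:instance.tail[0]]
    if "born" in between:
        return "born_in"
    if "works" in between:
        return "works_for"
    return "no_relation"


example = RelationInstance(
    tokens="Marie Curie was born in Warsaw".split(),
    head=(0, 2),  # "Marie Curie"
    tail=(5, 6),  # "Warsaw"
)
print(toy_classify(example))
```

A deep learning model replaces the cue-word rule with a learned encoder and a softmax over the label set, but the input and output shapes stay the same: one sentence, two entity spans, one label.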
- Clone the repository to your local computer.
git clone https://github.com/jinulee-v/victornlp_re
- Install the victornlp_re package using pip.
cd victornlp_re
mkdir corpus model
pip install -e ./victornlp_re
Note: All commands in this documentation should be executed in the same directory.
ls
corpus model victornlp_re
- Install dependencies.
pip install -r victornlp_re/requirements.txt
cd ..
Refer to the VictorNLP_Utils documentation. Add the downloaded files to victornlp_dp/victornlp_utils/data as instructed.
If the script raises NotFoundError, precautionary warnings listing the missing dependencies are printed.
This script automatically reads config_file and starts training. For convenience, command line arguments can override values from the configuration file. To keep the command line manageable, only major hyperparameters are supported this way. Details such as model hyperparameters (num_layers, dropout, ...) are better modified directly in victornlp_dp/config.json or other config files.
python -m victornlp_re.train
Refer to python -m victornlp_re.train -h for more information, and train.sh for some demos.
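The override behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions: the flag names (`--epoch`, `--batch-size`, `--learning-rate`) and the config keys are hypothetical, so check `python -m victornlp_re.train -h` for the arguments the script really accepts.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical flags; each defaults to None, meaning 'not given'."""
    parser = argparse.ArgumentParser(description="config override sketch")
    parser.add_argument("--config-file", default="config.json")
    parser.add_argument("--epoch", type=int)
    parser.add_argument("--batch-size", type=int)
    parser.add_argument("--learning-rate", type=float)
    return parser


def merge_config(config: dict, overrides: dict) -> dict:
    """Return a copy of config where every override that was actually
    supplied (not None) replaces the value read from the file."""
    merged = dict(config)
    for key, value in overrides.items():
        if value is not None:
            merged[key] = value
    return merged


# Values as they might appear in a config file (illustrative only).
file_config = {"epoch": 10, "batch_size": 32, "learning_rate": 0.001, "num_layers": 2}

# Simulate a command line that overrides only the epoch count.
args = build_parser().parse_args(["--epoch", "5"])
overrides = {k: v for k, v in vars(args).items() if k != "config_file"}

merged = merge_config(file_config, overrides)
print(merged)
```

Using `None` as the "not given" marker keeps the file value whenever a flag is omitted, and settings without a flag (here `num_layers`) can only change by editing the config file.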
This script also supports fine-tuning for different possible situations:
- Re-train with another corpus.
- Fine-tuning of pre-trained embeddings.
Refer to the following code snippet or train-finetune.sh for examples. You can override any hyperparameter supported as a command line argument, just as in training, and you can also modify models/Example_name/config.json for more precise and specific tuning.
python -m victornlp_re.train --finetune-model models/Example_name
ls models/Example
- model.pt
- config.json
Use this script to test model stats or parse sentences of interest. The only required argument is --model-dir, the name of a directory that contains model.pt and config.json. These two files are automatically generated by the victornlp_re.train script. Command line arguments can switch the script's behavior from testing to parsing. Details are given below.
python -m victornlp_re.test --model-dir ./model/(folder name)
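Before running the test script, it can help to confirm that the model directory holds both files. The helper below is a small sketch for that check; `missing_model_files` is a hypothetical name and is not part of victornlp_re.

```python
import tempfile
from pathlib import Path

# The two files the training script is described as generating.
REQUIRED_FILES = ("model.pt", "config.json")


def missing_model_files(model_dir) -> list:
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


# Demo on a temporary directory that contains only config.json.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "config.json").write_text("{}", encoding="utf-8")
    demo_missing = missing_model_files(tmp)

print(demo_missing)
```

An empty list means the directory is usable as a --model-dir value.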
Optional command line arguments:
- -a, --analyze: Freely select which analysis functions (victornlp_re/tools/analyze.py) to apply. The test set requires golden labels.
- --data-file: path/to/input/file. Loads a VictorNLP-format .json file to parse. If not provided, parses stdin inputs (line by line, PoS automatically tagged).
- --save-result: path/to/store/parsed_file.json. Stores the parsed result in the designated path.
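The choice between --data-file and stdin input can be pictured with the sketch below. It is an assumption-laden illustration: the `read_inputs` helper and the `"text"` key are hypothetical, and the real script also runs PoS tagging on stdin lines, which is omitted here.

```python
import io
import json
import sys


def read_inputs(data_file=None, stream=None) -> list:
    """Load inputs from a .json file when a path is given; otherwise
    read one sentence per line from a text stream (stdin by default)."""
    if data_file is not None:
        with open(data_file, encoding="utf-8") as f:
            return json.load(f)
    stream = stream if stream is not None else sys.stdin
    # Skip blank lines; wrap each sentence in a minimal record.
    return [{"text": line.strip()} for line in stream if line.strip()]


# Demo with an in-memory stream standing in for stdin.
fake_stdin = io.StringIO("First sentence.\n\nSecond sentence.\n")
records = read_inputs(stream=fake_stdin)
print(records)
```

Passing the stream as a parameter keeps the function testable without touching the real stdin.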