VictorNLP_RE: A multi-model, multi-lingual relation extraction package.

Summary

Relation Extraction

Relation extraction (RE) is a classification task: given a sentence and a pair of entities, an RE model classifies the relationship between them that the context implies. The VictorNLP-RE framework provides several deep learning models for relation extraction. Resources and implementation details are described in depth here:

VictorNLP_RE Documentation
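
To make the task definition concrete, here is a minimal sketch of what a single RE instance might look like. The class name, field names, and the relation label are hypothetical and chosen only for illustration; they are not the VictorNLP data format, which is described in the documentation linked above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RelationExample:
    """One relation-extraction instance (hypothetical layout, not the VictorNLP format)."""
    tokens: Tuple[str, ...]
    head: Tuple[int, int]  # [start, end) token span of the first entity
    tail: Tuple[int, int]  # [start, end) token span of the second entity
    label: str             # relation class the model should predict

    def entity_text(self, span: Tuple[int, int]) -> str:
        """Join the tokens covered by a [start, end) span."""
        start, end = span
        return " ".join(self.tokens[start:end])


example = RelationExample(
    tokens=("Marie", "Curie", "was", "born", "in", "Warsaw", "."),
    head=(0, 2),
    tail=(5, 6),
    label="place_of_birth",  # illustrative label, not taken from any corpus
)
print(example.entity_text(example.head), "->", example.entity_text(example.tail), ":", example.label)
```

An RE model receives the sentence and the two entity spans and outputs the label, which makes the problem an ordinary multi-class classification.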

VictorNLP Framework

VictorNLP, a PyTorch-based NLP framework, provides intuitive interfaces and carefully designed code modularity for implementing and pipelining NLP modules. For more details, refer to the VictorNLP Specification page.

VictorNLP Specification

Installation

Clone the repository to your local computer.

git clone https://github.com/jinulee-v/victornlp_re

Install the victornlp_re package using pip.

cd victornlp_re
mkdir corpus model
pip install -e ./victornlp_re

Note: All commands in this documentation should be executed from this same directory. Its contents should look like this:

ls

corpus model victornlp_re

Install dependencies.

pip install -r victornlp_re/requirements.txt
cd ..

How-to:

Prepare Datasets and pretrained embeddings

Dataset: TODO

Pretrained embeddings

Refer to VictorNLP_Utils documentation.

Add the downloaded files to victornlp_dp/victornlp_utils/data as instructed there.

If the script raises NotFoundError, check the warnings printed beforehand; they list the missing dependencies.

Train model

This script automatically reads config_file and starts training. For convenience, command line arguments can override values in the configuration file. To keep the command line manageable, only major hyperparameters are supported; model details such as num_layers and dropout should be edited directly in victornlp_dp/config.json or other config files.

python -m victornlp_re.train

Refer to python -m victornlp_re.train -h for more information, and train.sh for some demos.
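
The override behavior described above can be sketched as follows. This is only an illustration of the general pattern, not the actual code of victornlp_re.train: the flag names --epoch and --batch-size, the config keys, and the merge_config helper are all assumptions. Run python -m victornlp_re.train -h to see the real arguments.

```python
import argparse
import json


def merge_config(config: dict, overrides: dict) -> dict:
    """Return a copy of config in which every non-None override replaces the file value."""
    merged = dict(config)
    for key, value in overrides.items():
        if value is not None:
            merged[key] = value
    return merged


# Hypothetical flags; a default of None means "not given on the command line".
parser = argparse.ArgumentParser()
parser.add_argument("--epoch", type=int, default=None)
parser.add_argument("--batch-size", type=int, default=None)

# Stand-in for the contents of a config file (hypothetical keys).
file_config = json.loads('{"epoch": 10, "batch_size": 32, "num_layers": 2}')

# Simulate running the script with `--epoch 3`.
args = parser.parse_args(["--epoch", "3"])
config = merge_config(file_config, vars(args))
print(config)
```

Only the value passed on the command line changes; keys without a flag, such as num_layers here, can only be changed by editing the config file.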

Fine-tune model

This script also supports fine-tuning in the following situations:

Re-training with another corpus.
Fine-tuning of pre-trained embeddings.

Refer to the following code snippet or train-finetune.sh for examples. As with training, you can override any hyperparameter that has a command line argument, and you can also edit models/Example_name/config.json for more precise tuning.

python -m victornlp_re.train --finetune-model models/Example_name

ls models/Example_name

model.pt

config.json

Test model

Use this script to evaluate a trained model or to parse sentences of interest. The only required argument is --model-dir, the name of a directory that contains model.pt and config.json; both files are generated automatically by the victornlp_re.train script. Other command line arguments switch the script's behavior between testing and parsing, as detailed below.

python -m victornlp_re.test --model-dir ./model/(folder name)
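
Before running the test script, it can help to confirm that the directory passed to --model-dir really contains both required files. The helper below is a small standalone sketch; missing_model_files is a hypothetical name and is not part of victornlp_re.

```python
import tempfile
from pathlib import Path

# File names the test script expects inside --model-dir, per the description above.
REQUIRED_FILES = ("model.pt", "config.json")


def missing_model_files(model_dir):
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


# Demonstration on a temporary directory that holds only config.json.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "config.json").write_text("{}")
    print(missing_model_files(tmp))
```

An empty list means the directory is usable as a --model-dir argument.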

Optional Command line arguments:

-a, --analyze : Selects which analysis functions (defined in victornlp_re/tools/analyze.py) to apply. The test set must contain gold labels.
--data-file : path/to/input/file. Loads a .json file in VictorNLP format to parse. If not provided, the script parses stdin input line by line, with PoS tags assigned automatically.
--save-result : path/to/store/parsed_file.json. Stores the parsed result at the designated path.
