
multi2oie's People

Contributors

youngbin-ro

multi2oie's Issues

Possible inconsistency between paper and repository

I noticed a difference between your code and the paper "Multi^2OIE: Multilingual Open Information Extraction Based on Multi-Head Attention with BERT".
The paper feeds the MultiHeadAttention blocks a key and a value that are both masked BERT hidden states, plus an unmasked query (the BERT hidden state). Your code instead uses the unmasked BERT hidden state for both query and value, and the masked BERT hidden state only for the key.

Paper:
key = value = bert_hidden[masked]
query = bert_hidden

Your Repository:
query = value = bert_hidden
key = bert_hidden[masked]

Maybe I misunderstood something, but why did you change this part of the architecture?
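To make the difference between the two wirings concrete, here is a minimal single-head sketch in plain Python. It is not the repository's or the paper's code: the `attention` helper, the toy `bert_hidden` values, and the idea that "masked" means zeroing every non-predicate row are all assumptions made purely for illustration.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, key, value):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
    d = len(query[0])
    out = []
    for q in query:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in key]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, value))
                    for j in range(len(value[0]))])
    return out

# Toy stand-in for BERT hidden states: 3 tokens, hidden size 2 (hypothetical values).
bert_hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Assumption: masking keeps only the predicate token (index 1) and zeroes the rest.
pred_mask = [0, 1, 0]
masked_hidden = [[h * m for h in row] for row, m in zip(bert_hidden, pred_mask)]

# Wiring described for the paper: key = value = masked, query = unmasked.
paper_out = attention(query=bert_hidden, key=masked_hidden, value=masked_hidden)
# Wiring described for the repository: query = value = unmasked, key = masked.
repo_out = attention(query=bert_hidden, key=masked_hidden, value=bert_hidden)

print(paper_out)
print(repo_out)
```

Under these assumptions the attention weights are identical in both cases (same query and key), but the outputs differ because the value differs: with a masked value, only predicate information can flow into the output, while with an unmasked value every token contributes.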

Preprocessing runs to 100% but produces only a 0 KB openie4_train.pkl

Hello. I followed the steps, but the preprocessing stage does not generate openie4_train.pkl as expected.
cd utils
python preprocess.py \
    --mode 'train' \
    --data '../datasets/structured_data.json' \
    --save_path '../datasets/openie4_train.pkl' \
    --bert_config 'bert-base-cased' \
    --max_len 64

I suspect a version incompatibility involving the jsonschema package is behind this. During the Environmental Setup stage, running pip install -r requirements.txt prints errors like these:
ERROR: nbclient 0.5.3 has requirement jupyter-client>=6.1.5, but you'll have jupyter-client 5.3.5 which is incompatible.
ERROR: datascience 0.10.6 has requirement folium==0.2.1, but you'll have folium 0.8.3 which is incompatible.
ERROR: albumentations 0.1.12 has requirement imgaug<0.2.7,>=0.2.5, but you'll have imgaug 0.2.9 which is incompatible.

But I'm not yet sure what causes this issue.
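A quick way to narrow this down is to check whether the output file is missing, empty, or simply not a loadable pickle. The sketch below uses only the standard library; `check_pickle` is a hypothetical helper, not part of the repository, and it assumes nothing about what preprocess.py writes beyond it being a pickle file.

```python
import os
import pickle

def check_pickle(path):
    """Return a short status string describing the pickle file at `path`."""
    if not os.path.exists(path):
        return "missing"
    if os.path.getsize(path) == 0:
        # A 0-byte file means it was opened for writing but nothing was dumped.
        return "empty"
    try:
        with open(path, "rb") as f:
            data = pickle.load(f)
    except (pickle.UnpicklingError, EOFError):
        return "unreadable"
    return f"ok: {type(data).__name__}"

# Path taken from the command above; adjust it to your working directory.
print(check_pickle("../datasets/openie4_train.pkl"))
```

An "empty" result suggests the script stopped (or raised an error) between opening the file and writing to it, so the console output of preprocess.py after the progress bar reaches 100% is the next place to look.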

Format of data for test.py

Thank you for your amazing work.
I am having trouble running the trained model on my own dataset. My dataset is in .txt format; what should I do to convert it to .pkl format properly?

Having trouble replicating results with file structuring

I noticed some slight inconsistencies when trying to replicate the results of this model, starting from testing. When the general reader script is run, it takes in "openie4_test.txt", but this file does not seem to exist in the repository. Do you know why this is?

A problem with carb

Hello, I ran into a problem with carb. The error output is below. Thanks for your help!

/code/XXX/miniconda3/envs/multi2oie/lib/python3.7/site-packages/sklearn/utils/deprecation.py:144: FutureWarning: The sklearn.preprocessing.data module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.preprocessing. Anything that cannot be imported from sklearn.preprocessing is now part of the private API.
  warnings.warn(message, FutureWarning)
[nltk_data] Error loading wordnet: <urlopen error [Errno 111]
[nltk_data]     Connection refused>
Traceback (most recent call last):
  File "train.py", line 8, in <module>
    from test import do_eval
  File "/home/wurui/spacy_ie/Multi2OIE-master/test.py", line 13, in <module>
    from carb.tabReader import TabReader
ModuleNotFoundError: No module named 'carb.tabReader'

If you suspect this is an IPython 7.14.0 bug, please report it at:
    https://github.com/ipython/ipython/issues
or send an email to the mailing list at [email protected]

You can print a more detailed traceback right now with "%tb", or use "%debug"
to interactively debug it.

Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
    %config Application.verbose_crash=True
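One way to investigate a ModuleNotFoundError like this is to ask Python which file, if any, each dotted name resolves to from the directory where train.py is launched. The sketch below uses only `importlib.util.find_spec` from the standard library; `locate` is a hypothetical helper name, and the two module names come from the traceback above.

```python
import importlib.util

def locate(module_name):
    """Return the file a dotted module name resolves to, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. "carb") cannot be imported at all.
        return None
    if spec is None:
        # The parent package exists, but the submodule does not.
        return None
    return spec.origin

# Run this from the same directory and environment as train.py.
print("carb          ->", locate("carb"))
print("carb.tabReader ->", locate("carb.tabReader"))
```

If `carb` resolves but `carb.tabReader` does not, the `carb` being picked up is probably not the directory that test.py expects (for example, a different installed package with the same name, or a carb folder that is missing tabReader.py).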

Can't reproduce the results

I ran the project using the parameters you provided (for the parameters you didn't provide, I used the default values in the code). The parameter configuration is as follows:
{
  "seed": 1,
  "save_path": "./results",
  "bert_config": "bert-base-cased",
  "trn_data_path": "./datasets/openie4_train.pkl",
  "dev_data_path": [
    "./datasets/oie2016_dev.pkl",
    "./datasets/carb_dev.pkl"
  ],
  "dev_gold_path": [
    "./evaluate/OIE2016_dev.txt",
    "./carb/CaRB_dev.tsv"
  ],
  "max_len": 64,
  "device": "cuda",
  "visible_device": "2",
  "summary_step": 100,
  "use_lstm": false,
  "binary": false,
  "epochs": 1,
  "lstm_dropout": 0.0,
  "mh_dropout": 0.2,
  "pred_clf_dropout": 0.0,
  "arg_clf_dropout": 0.2,
  "batch_size": 64,
  "dev_batch_size": 128,
  "learning_rate": 3e-05,
  "n_arg_heads": 8,
  "n_arg_layers": 4,
  "pos_emb_dim": 64,
  "pred_n_labels": 3,
  "arg_n_labels": 9,
  "total_steps": 33791,
  "warmup_steps": 3379
}

I got the best dev results within the first 1000 steps, after which the F1 score decreased. The results are shown in the following figure.
[image]
I'm wondering whether I set some parameters incorrectly, or whether there are mistakes in the parameters you provided.
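For context on where step 1000 falls relative to the configured warmup, the sketch below computes the learning rate under a linear warmup followed by linear decay. That schedule shape is an assumption (the issue does not say which scheduler the training script uses), and `linear_warmup_decay_lr` is a hypothetical helper; only `learning_rate`, `warmup_steps`, and `total_steps` come from the configuration above.

```python
def linear_warmup_decay_lr(step, base_lr=3e-05, warmup_steps=3379, total_steps=33791):
    """Learning rate at `step`, assuming linear warmup then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr over the warmup period.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)

for step in (100, 1000, 3379, 20000, 33791):
    print(step, linear_warmup_decay_lr(step))
```

Under this assumed schedule, step 1000 is still inside the warmup period (1000 < 3379), so the best dev score would have been reached before the learning rate hit its configured peak of 3e-05.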
