
kepler's People

Contributors

bakser, gaotianyu1350, michalpitr, trellixvulnteam

kepler's Issues

Can't convert the pretrained KEPLER model correctly

I can't convert the pretrained KEPLER model (the one you provide) to HuggingFace Transformers using fairseq==0.9.0 and transformers==2.2.2, because the conversion code seems to be based on a newer version.

  1. Actually, I've tried convert_roberta_checkpoint_to_pytorch.py from transformers==2.2.2, but I get a "shape doesn't match" error at a linear layer of the decoder.

  2. I've also tried this script with transformers==4.7.0, and I get the following:
    Traceback (most recent call last):
      File "convert_kepler.py", line 177, in <module>
        args.roberta_checkpoint_path, args.pytorch_dump_folder_path, args.classification_head
      File "convert_kepler.py", line 54, in convert_roberta_checkpoint_to_pytorch
        roberta = FairseqRobertaModel.from_pretrained(roberta_checkpoint_path)
      File "/data/JiangZhiShu/miniconda3/envs/KEPLER/lib/python3.7/site-packages/fairseq/models/roberta/model.py", line 144, in from_pretrained
        **kwargs,
      File "/data/JiangZhiShu/miniconda3/envs/KEPLER/lib/python3.7/site-packages/fairseq/hub_utils.py", line 68, in from_pretrained
        arg_overrides=kwargs,
      File "/data/JiangZhiShu/miniconda3/envs/KEPLER/lib/python3.7/site-packages/fairseq/checkpoint_utils.py", line 190, in load_model_ensemble_and_task
        state = load_checkpoint_to_cpu(filename, arg_overrides)
      File "/data/JiangZhiShu/miniconda3/envs/KEPLER/lib/python3.7/site-packages/fairseq/checkpoint_utils.py", line 166, in load_checkpoint_to_cpu
        state = _upgrade_state_dict(state)
      File "/data/JiangZhiShu/miniconda3/envs/KEPLER/lib/python3.7/site-packages/fairseq/checkpoint_utils.py", line 349, in _upgrade_state_dict
        registry.set_defaults(state["args"], tasks.TASK_REGISTRY[state["args"].task])
    KeyError: 'MLMetKE'

So I want to know how to convert the pretrained KEPLER model correctly. Is this a version issue or something else?
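A minimal workaround sketch, not from the authors and untested against this exact checkpoint: the stock fairseq releases have no "MLMetKE" task registered, which is why the checkpoint-upgrade step fails with the KeyError above. One option is to rewrite the task name stored in the checkpoint before running the conversion script; the file names and the "masked_lm" fallback below are assumptions, not official guidance.

import torch

# Hedged workaround: patch the stored task name so stock fairseq's
# _upgrade_state_dict does not look up the unregistered "MLMetKE" task.
ckpt = torch.load("KEPLER.pt", map_location="cpu")        # placeholder path
args = ckpt.get("args")
if args is not None and getattr(args, "task", None) == "MLMetKE":
    args.task = "masked_lm"                               # a task the stock registry does know
    ckpt["args"] = args
torch.save(ckpt, "KEPLER_patched.pt")                     # then point the converter at this file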

Converting RoBERTa's pytorch_model.bin to .pt format

Hello, I'd like to ask: in the provided training script there is a step that loads a RoBERTa checkpoint, and I noticed it expects the .pt format. Do I need to convert pytorch_model.bin to .pt? If so, could you provide the conversion code? It has to match fairseq, and none of the converters I found online work. Thanks.
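For what it's worth, here is a small inspection sketch (an illustration under assumptions, not the requested converter): a fairseq .pt checkpoint is a dictionary with entries such as "args" and "model", while pytorch_model.bin is a bare state_dict, so a converter has to both wrap the weights in that outer dictionary and rename every parameter to fairseq's naming scheme. The paths below are placeholders.

import torch

# Hedged sketch: compare a fairseq checkpoint with a HuggingFace one to see
# what a pytorch_model.bin -> .pt converter would have to produce.
fairseq_ckpt = torch.load("roberta.base/model.pt", map_location="cpu")   # placeholder path
hf_state = torch.load("pytorch_model.bin", map_location="cpu")           # placeholder path

print(sorted(fairseq_ckpt.keys()))               # e.g. ['args', 'extra_state', 'model', ...]
print(list(fairseq_ckpt["model"].keys())[:5])    # fairseq parameter names
print(list(hf_state.keys())[:5])                 # HuggingFace parameter names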

Pre-training cost of KEPLER

First of all, congratulations on your amazing work.

I have read your paper and I was wondering how much pre-training KEPLER required: how many steps it was trained for, how long it took, and on what hardware. As far as I understand, you used a 12K batch size and RoBERTa weights as the starting point, but I couldn't find information about the computational resources needed to train the model.

Thanks!

Is NCCL required?

Can I use just CUDA?
(I'm still downloading the various datasets and the released models, so I haven't run the code yet.)

Generating knowledge embeddings with KEPLER

Hello,
How can I use KEPLER to generate knowledge embeddings from the graph triplets without feeding the model any entity descriptions? From what I read in your paper this should be possible, but I can't seem to find instructions for it in your repo.

MLM/dict.txt

Hello, when I run the pre-training command it keeps complaining that MLM/dict.txt cannot be found. I searched carefully and there is no command that produces dict.txt, including the MLM data preprocessing commands.
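One hedged guess, assuming the MLM corpus was BPE-encoded with fairseq's GPT-2 vocabulary as in the standard RoBERTa pretraining recipe: dict.txt would then be the GPT-2 BPE dictionary published by fairseq rather than something the KEPLER scripts generate (fairseq-preprocess with --srcdict also copies it into the destination directory). A download sketch, with the target directory as an assumption:

import os
import urllib.request

# Hedged sketch: fetch fairseq's GPT-2 BPE dictionary and place it where the
# pre-training command looks for it; adjust "MLM" to your own data directory.
os.makedirs("MLM", exist_ok=True)
url = "https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/dict.txt"
urllib.request.urlretrieve(url, os.path.join("MLM", "dict.txt"))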

Transforming models to Transformers

When I try to convert the model from fairseq to Hugging Face Transformers with the provided script, I get the following error:

AttributeError: 'RobertaModel' object has no attribute 'encoder'

But I've followed the README to install the provided Fairseq release.

Could anyone tell me how to solve this?
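A hedged guess at the cause: in the 0.9-era fairseq line that the provided release appears to be based on, RobertaModel keeps its transformer under .decoder rather than .encoder, so a conversion script written for newer fairseq trips over the missing attribute. Below is a defensive lookup that tolerates either layout, sketched under the assumption that the provided fairseq release is installed; the checkpoint directory and file name are placeholders.

from fairseq.models.roberta import RobertaModel

# Hedged sketch: locate the sentence encoder whether the model exposes it as
# .encoder (newer fairseq) or .decoder (older fairseq, as in this release).
roberta = RobertaModel.from_pretrained("path/to/kepler_dir", checkpoint_file="KEPLER.pt")
inner = getattr(roberta.model, "encoder", None) or roberta.model.decoder
sentence_encoder = inner.sentence_encoder
print(type(sentence_encoder))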

Unstable Wikidata5M downloads from within China

Hello, when downloading Wikidata5M from within China, the connection keeps dropping before much has been downloaded. Could the authors provide a download source that is more stable within China?

About the entity types

Hi, could you please provide an entity type file for all entities in the Wikidata triplets, so that I can compute statistics like those in Table 2? Thanks!

Problems with the OpenEntity task (typing)

After converting the pretrained model with transformers' convert_roberta_original_pytorch_checkpoint_to_pytorch, I got config.json and pytorch_model.bin.
[screenshot of the converted files]
Then I wanted to try the OpenEntity task and set '--model_name_or_path ./model_convert/' in the script, but I got an error as follows:
[screenshot of the error]

Did I miss some files or some settings?

The construction of Wikidata5m

Great work!

I have several questions about the construction of Wikidata5m:
How did you align each entity in Wikidata with its Wikipedia page, and how did you extract the entity descriptions?
Did you use any off-the-shelf tools?
Could you provide the construction code for Wikidata5m?

Looking forward to your reply! Thanks!

Tsinghua Cloud pre-trained model unavailable from abroad

Hello, I'm trying to establish whether KEPLER improves biomedical NER and whether training it from scratch on a domain dataset would yield any gains. Before I go through that, though, I would like to play around with the pre-trained model, but it seems I can't access Tsinghua Cloud from outside China.
Would you be willing to upload it to some cloud solution available abroad (e.g. Google Cloud) or suggest some workaround to access Tsinghua Cloud?

Much appreciated,
Thank you

Unable to train the model because of fairseq-train: error: the following arguments are required: --arch/-a

To train the model, I run this script:

TOTAL_UPDATES=125000   # Total number of training steps
WARMUP_UPDATES=10000   # Warm up the learning rate over this many updates
LR=6e-04               # Peak LR for the polynomial LR scheduler
NUM_CLASSES=2
MAX_SENTENCES=3        # Batch size
NUM_NODES=1
ROBERTA_PATH="path/to/roberta.base/model.pt"   # Path to the original RoBERTa model
CHECKPOINT_PATH="path/to/checkpoints"          # Directory to store the checkpoints
UPDATE_FREQ=$(expr 784 / $NUM_NODES)           # Increase the effective batch size

DATA_DIR=../Data

# Path to the preprocessed KE dataset; each item corresponds to a data directory for one epoch
KE_DATA=$DATA_DIR/KEI/KEI1_0:$DATA_DIR/KEI/KEI1_1:$DATA_DIR/KEI/KEI1_2:$DATA_DIR/KEI/KEI1_3:$DATA_DIR/KEI/KEI3_0:$DATA_DIR/KEI/KEI3_1:$DATA_DIR/KEI/KEI3_2:$DATA_DIR/KEI/KEI3_3:$DATA_DIR/KEI/KEI5_0:$DATA_DIR/KEI/KEI5_1:$DATA_DIR/KEI/KEI5_2:$DATA_DIR/KEI/KEI5_3:$DATA_DIR/KEI/KEI7_0:$DATA_DIR/KEI/KEI7_1:$DATA_DIR/KEI/KEI7_2:$DATA_DIR/KEI/KEI7_3:$DATA_DIR/KEI/KEI9_0:$DATA_DIR/KEI/KEI9_1:$DATA_DIR/KEI/KEI9_2:$DATA_DIR/KEI/KEI9_3:

DIST_SIZE=$(expr $NUM_NODES \* 4)

fairseq-train $DATA_DIR/MLM \
    --KEdata $KE_DATA \
    --restore-file $ROBERTA_PATH \
    --save-dir $CHECKPOINT_PATH \
    --max-sentences $MAX_SENTENCES \
    --tokens-per-sample 512 \
    --task MLMetKE \
    --sample-break-mode complete \
    --required-batch-size-multiple 1 \
    --arch roberta_base \
    --criterion MLMetKE \
    --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01 \
    --optimizer adam --adam-betas "(0.9, 0.98)" --adam-eps 1e-06 \
    --clip-norm 0.0 \
    --lr-scheduler polynomial_decay --lr $LR --total-num-update $TOTAL_UPDATES --warmup-updates $WARMUP_UPDATES \
    --update-freq $UPDATE_FREQ \
    --negative-sample-size 1 \
    --ke-model TransE \
    --init-token 0 \
    --separator-token 2 \
    --gamma 4 \
    --nrelation 822 \
    --skip-invalid-size-inputs-valid-test \
    --fp16 --fp16-init-scale 2 --threshold-loss-scale 1 --fp16-scale-window 128 \
    --reset-optimizer --distributed-world-size ${DIST_SIZE} --ddp-backend no_c10d --distributed-port 23456 \
    --log-format simple --log-interval 1
# --negative-sample-size 1: negative sampling size (one negative head and one negative tail)
# --gamma 4: margin of the KE objective
# Add --relation-desc to encode the relation descriptions as relation embeddings (KEPLER-Rel in the paper)

I get an error: fairseq-train: error: the following arguments are required: --arch/-a

Environment

Python version: 3.9.12
fairseq: latest version
PyTorch version: 1.12.0
sklearn: 1.1.1
OS: Linux

Continued pretraining from pytorch model

Hi,
I'm curious whether there is a way to take a pre-trained model from the Hugging Face hub and use it to initialize KEPLER training. I was hoping to initialize KEPLER with a RoBERTa pretrained on medical data and then run KEPLER pretraining with medical knowledge graphs and medical MLM.

Many thanks,
Michal

Transforming the fairseq model to a Hugging Face Transformers model

I changed the conversion code at line 56 from

roberta_sent_encoder = roberta.model.encoder.sentence_encoder

to

roberta_sent_encoder = roberta.model.decoder.sentence_encoder

However another error occurred:

AttributeError: 'MultiheadAttention' object has no attribute 'k_proj'

I'm not sure whether modifying the conversion code to match the KEPLER code like this is the right way to convert.

My transformers version is 2.0.0, and my fairseq version is 0.9.0.
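A hedged note rather than a verified fix: in the older fairseq layout that the provided release uses, MultiheadAttention stores one fused in_proj_weight/in_proj_bias instead of separate q_proj/k_proj/v_proj modules, which is why a converter that reads k_proj fails. If you adapt the conversion script yourself, the fused tensors can be split as below; the q, k, v ordering along dim 0 matches that older layout, and the helper name is purely illustrative.

import torch

def split_in_proj(in_proj_weight: torch.Tensor, in_proj_bias: torch.Tensor):
    # Hedged sketch: split fused projection tensors (old fairseq MultiheadAttention)
    # into the separate q/k/v weights and biases that newer converters expect.
    hidden = in_proj_weight.shape[1]
    q_w, k_w, v_w = in_proj_weight[:hidden], in_proj_weight[hidden:2 * hidden], in_proj_weight[2 * hidden:]
    q_b, k_b, v_b = in_proj_bias[:hidden], in_proj_bias[hidden:2 * hidden], in_proj_bias[2 * hidden:]
    return (q_w, q_b), (k_w, k_b), (v_w, v_b)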

Failed to convert your checkpoint.

When I try:
python -m transformers.models.roberta.convert_roberta_original_pytorch_checkpoint_to_pytorch --roberta_checkpoint_path ./ --pytorch_dump_folder_path ./pytorch_model.bin

I got:
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.7/site-packages/transformers/models/roberta/convert_roberta_original_pytorch_checkpoint_to_pytorch.py", line 181, in <module>
    args.roberta_checkpoint_path, args.pytorch_dump_folder_path, args.classification_head
  File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.7/site-packages/transformers/models/roberta/convert_roberta_original_pytorch_checkpoint_to_pytorch.py", line 58, in convert_roberta_checkpoint_to_pytorch
    roberta = FairseqRobertaModel.from_pretrained(roberta_checkpoint_path)
  File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.7/site-packages/fairseq/models/roberta/model.py", line 251, in from_pretrained
    **kwargs,
  File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.7/site-packages/fairseq/hub_utils.py", line 72, in from_pretrained
    arg_overrides=kwargs,
  File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.7/site-packages/fairseq/checkpoint_utils.py", line 279, in load_model_ensemble_and_task
    state = load_checkpoint_to_cpu(filename, arg_overrides)
  File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.7/site-packages/fairseq/checkpoint_utils.py", line 232, in load_checkpoint_to_cpu
    state = _upgrade_state_dict(state)
  File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.7/site-packages/fairseq/checkpoint_utils.py", line 434, in _upgrade_state_dict
    registry.set_defaults(state["args"], tasks.TASK_REGISTRY[state["args"].task])
KeyError: 'MLMetKE'

Clearly, the stock script can't handle this checkpoint. Could you please provide the converted weights? Many thanks.

KE Evaluation

--entity2id: a JSON file that maps entity names (as they appear in the dataset) to indices in the entity embedding numpy file; the key is the entity name and the value is the row index in the numpy file.
--relation2id: a JSON file that maps relation names (as they appear in the dataset) to indices in the relation embedding numpy file.

Where do these files come from?
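A hedged illustration of the expected shape of these files rather than their official provenance: if the entity embeddings are dumped row by row in the order of some entity list, the matching entity2id.json is just that ordering written out as name-to-index pairs; the same pattern applies to relation2id.json. "entities.txt" below is a hypothetical one-name-per-line file.

import json

# Hedged sketch: map each entity name to the row index it occupies in the
# entity embedding numpy file, assuming names and rows share one ordering.
with open("entities.txt", encoding="utf-8") as f:
    entity2id = {line.strip(): idx for idx, line in enumerate(f) if line.strip()}

with open("entity2id.json", "w", encoding="utf-8") as f:
    json.dump(entity2id, f, ensure_ascii=False)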

Relation classification

Hi, I would like to ask: if I want to use relation classification as the downstream NLP task and retrain the model without using the released checkpoints, do I need to remove the data that overlaps with FewRel from the KE data, in addition to following your data-processing steps? Also, does each period-delimited segment in Qdesc.txt correspond to one entity description, and does the following segment correspond to the description of the tail entity?
