thu-keg / KEPLER
Source code for the TACL paper "KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation".
License: MIT License
It has been quite a while. Could you please publish a release first?
I can't convert the pretrained KEPLER model (the one you provide) to HuggingFace Transformers using fairseq==0.9.0 and transformers==2.2.2, because this code seems to be based on a newer version.
I actually tried convert_roberta_checkpoint_to_pytorch.py from transformers==2.2.2, but I get a "shape doesn't match" error at a linear layer of the decoder.
I also tried this code with transformers==4.7.0, and I get the following:
Traceback (most recent call last):
  File "convert_kepler.py", line 177, in <module>
    args.roberta_checkpoint_path, args.pytorch_dump_folder_path, args.classification_head
  File "convert_kepler.py", line 54, in convert_roberta_checkpoint_to_pytorch
    roberta = FairseqRobertaModel.from_pretrained(roberta_checkpoint_path)
  File "/data/JiangZhiShu/miniconda3/envs/KEPLER/lib/python3.7/site-packages/fairseq/models/roberta/model.py", line 144, in from_pretrained
    **kwargs,
  File "/data/JiangZhiShu/miniconda3/envs/KEPLER/lib/python3.7/site-packages/fairseq/hub_utils.py", line 68, in from_pretrained
    arg_overrides=kwargs,
  File "/data/JiangZhiShu/miniconda3/envs/KEPLER/lib/python3.7/site-packages/fairseq/checkpoint_utils.py", line 190, in load_model_ensemble_and_task
    state = load_checkpoint_to_cpu(filename, arg_overrides)
  File "/data/JiangZhiShu/miniconda3/envs/KEPLER/lib/python3.7/site-packages/fairseq/checkpoint_utils.py", line 166, in load_checkpoint_to_cpu
    state = _upgrade_state_dict(state)
  File "/data/JiangZhiShu/miniconda3/envs/KEPLER/lib/python3.7/site-packages/fairseq/checkpoint_utils.py", line 349, in _upgrade_state_dict
    registry.set_defaults(state["args"], tasks.TASK_REGISTRY[state["args"].task])
KeyError: 'MLMetKE'
So I would like to know how to convert the pretrained KEPLER model correctly. Is this a version issue, or something else?
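Not an official fix, but one workaround that may be worth trying: the KeyError comes from fairseq looking up the custom task name recorded inside the checkpoint. Rewriting that field to a task every fairseq release knows (here 'masked_lm'; the function and approach are my own sketch, not from the repo) can let the stock converter at least load the shared RoBERTa weights:

```python
import torch

def rewrite_checkpoint_task(src, dst, task="masked_lm"):
    """Workaround sketch (an assumption, not an official fix): the stock
    RoBERTa converter crashes with KeyError: 'MLMetKE' because that custom
    task name, recorded inside the checkpoint, is missing from fairseq's
    TASK_REGISTRY. Overwriting it with a registered task such as 'masked_lm'
    may let the converter proceed."""
    state = torch.load(src, map_location="cpu", weights_only=False)
    state["args"].task = task
    state["args"].criterion = task
    torch.save(state, dst)
```

The KE-specific parameters will still be in the state dict; whether the converted weights behave correctly needs to be verified separately.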
It seems that none of the files at this link (https://deepgraphlearning.github.io/project/wikidata5m) are available for download anymore. Does the author have a new download link?
Hello, I have a question about training the model. In the provided script there is a step that loads a RoBERTa checkpoint, and I noticed it uses the .pt format. Do I need to convert pytorch_model.bin to .pt? If so, could you provide the conversion code? It needs to be compatible with fairseq, and none of the scripts I found online work. Thanks!
First of all, congratulations on your amazing work.
I have read your paper and was wondering how much pre-training KEPLER required: how many steps it was trained for, how long that took, and on what hardware. As far as I understand, you used a 12K batch size and RoBERTa weights as a starting point, but I couldn't find information about the computational resources needed to train your model.
Thanks!
When converting to the Transformers model structure with the provided example, I get KeyError: 'MLMetKE'. How can I solve this?
Is it not possible to just use CUDA?
(I am still downloading the datasets and released models, so I have not run the code yet.)
What are the json files for these args --entity2id and --relation2id?
Hello,
How can I use KEPLER to generate knowledge embeddings from the graph triples without feeding the model any entity descriptions? From what I read in your paper this is possible, but I cannot find instructions for it in your repo.
Hello, when I run the pretraining command it keeps reporting that MLM/dict.txt cannot be found. I searched carefully, and no command produces dict.txt, including the MLM data preprocessing commands.
When I try to transform the model from Fairseq to Huggingface Transformers with the provided script, the following error occurred:
AttributeError: 'RobertaModel' object has no attribute 'encoder'
But I've followed the README to install the provided Fairseq release.
Could anyone tell me how to solve this?
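One hedged guess at the cause: across fairseq releases the RoBERTa transformer stack moved between model.encoder and model.decoder, so a conversion script written for one layout raises this AttributeError on the other. A version-tolerant accessor (my own sketch, not from the repo) would look like:

```python
def get_sentence_encoder(model):
    # Assumption: the loaded fairseq RoBERTa wrapper exposes the transformer
    # stack as either model.encoder or model.decoder depending on the release.
    # Try both instead of hard-coding one attribute name.
    inner = getattr(model, "encoder", None) or getattr(model, "decoder")
    return inner.sentence_encoder
```

With this helper, the line in the conversion script that reads `roberta.model.encoder.sentence_encoder` could call `get_sentence_encoder(roberta.model)` instead.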
Hi, could you please provide an entity type file for all entities in the Wiki triples, so that I can compute statistics like those in Table 2? Thanks!
After converting the pretrained model with transformers' convert_roberta_original_pytorch_checkpoint_to_pytorch, I got config.json and pytorch_model.bin.
Then I wanted to try the OpenEntity task and set '--model_name_or_path ./model_convert/' in the script, but I got the following error.
Did I miss some files or some settings?
Great work!
I have several questions about the construction of Wikidata5m.
How did you align each entity in Wikidata to its Wikipedia page? And how did you extract the entity descriptions?
Did you use any off-the-shelf tools?
Could you provide the construction code for Wikidata5m?
Looking forward to your reply! Thanks!
Hello, I'm trying to establish whether KEPLER improves biomedical NER and whether training it from scratch on a domain dataset would yield any improvements. Before I go through that, though, I would like to play around with the pre-trained model, but it seems I can't access Tsinghua Cloud from outside China.
Would you be willing to upload it to some cloud solution available abroad (e.g. Google Cloud) or suggest some workaround to access Tsinghua Cloud?
Much appreciated,
Thank you!
It seems that the tail entity and the relation are swapped here. If I try to pretrain the model with data generated by this script, I get an error because the relation indices are larger than the size of the relation embedding matrix.
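A small sanity check can confirm such a swap before launching pretraining. This sketch assumes the generated files contain whitespace-separated integer triples (head, relation, tail); the function name is mine:

```python
def find_bad_relations(path, nrelation=822):
    """Return (line number, relation id) for every relation id that is out of
    range. nrelation defaults to the value passed to the pretraining script;
    any id >= nrelation would make the relation embedding lookup fail."""
    bad = []
    with open(path) as f:
        for lineno, line in enumerate(f, 1):
            head, rel, tail = map(int, line.split())
            if rel >= nrelation:
                bad.append((lineno, rel))
    return bad
```

If swapped columns are the cause, most reported "relation" ids will look like entity ids (far larger than 822).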
To train the model I run this script:
TOTAL_UPDATES=125000    # Total number of training steps
WARMUP_UPDATES=10000    # Warm up the learning rate over this many updates
LR=6e-04                # Peak LR for the polynomial LR scheduler
NUM_CLASSES=2
MAX_SENTENCES=3         # Batch size
NUM_NODES=1
ROBERTA_PATH="path/to/roberta.base/model.pt"   # Path to the original RoBERTa model
CHECKPOINT_PATH="path/to/checkpoints"          # Directory to store the checkpoints
UPDATE_FREQ=`expr 784 / $NUM_NODES`            # Increase the effective batch size
DATA_DIR=../Data
# Path to the preprocessed KE dataset; each item corresponds to a data directory for one epoch
KE_DATA=$DATA_DIR/KEI/KEI1_0:$DATA_DIR/KEI/KEI1_1:$DATA_DIR/KEI/KEI1_2:$DATA_DIR/KEI/KEI1_3:$DATA_DIR/KEI/KEI3_0:$DATA_DIR/KEI/KEI3_1:$DATA_DIR/KEI/KEI3_2:$DATA_DIR/KEI/KEI3_3:$DATA_DIR/KEI/KEI5_0:$DATA_DIR/KEI/KEI5_1:$DATA_DIR/KEI/KEI5_2:$DATA_DIR/KEI/KEI5_3:$DATA_DIR/KEI/KEI7_0:$DATA_DIR/KEI/KEI7_1:$DATA_DIR/KEI/KEI7_2:$DATA_DIR/KEI/KEI7_3:$DATA_DIR/KEI/KEI9_0:$DATA_DIR/KEI/KEI9_1:$DATA_DIR/KEI/KEI9_2:$DATA_DIR/KEI/KEI9_3:
DIST_SIZE=`expr $NUM_NODES \* 4`

# --negative-sample-size: negative sampling size (one negative head and one negative tail)
# --gamma: margin of the KE objective
# Add --relation-desc to encode relation descriptions as relation embeddings (KEPLER-Rel in the paper)
fairseq-train $DATA_DIR/MLM \
    --KEdata $KE_DATA \
    --restore-file $ROBERTA_PATH \
    --save-dir $CHECKPOINT_PATH \
    --max-sentences $MAX_SENTENCES \
    --tokens-per-sample 512 \
    --task MLMetKE \
    --sample-break-mode complete \
    --required-batch-size-multiple 1 \
    --arch roberta_base \
    --criterion MLMetKE \
    --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01 \
    --optimizer adam --adam-betas "(0.9, 0.98)" --adam-eps 1e-06 \
    --clip-norm 0.0 \
    --lr-scheduler polynomial_decay --lr $LR --total-num-update $TOTAL_UPDATES --warmup-updates $WARMUP_UPDATES \
    --update-freq $UPDATE_FREQ \
    --negative-sample-size 1 \
    --ke-model TransE \
    --init-token 0 \
    --separator-token 2 \
    --gamma 4 \
    --nrelation 822 \
    --skip-invalid-size-inputs-valid-test \
    --fp16 --fp16-init-scale 2 --threshold-loss-scale 1 --fp16-scale-window 128 \
    --reset-optimizer --distributed-world-size ${DIST_SIZE} --ddp-backend no_c10d --distributed-port 23456 \
    --log-format simple --log-interval 1
I get an error: fairseq-train: error: the following arguments are required: --arch/-a
Environment
Python version: 3.9.12
fairseq: latest version
PyTorch Version: 1.12.0
sklearn: 1.1.1
OS : Linux
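For context, a frequent cause of this exact fairseq-train error is a broken shell line continuation: anything after the trailing backslash (a comment, even a space) ends the argument list early, so flags on later lines, including --arch, never reach the program. A minimal sh demonstration (generic shell behavior, not KEPLER-specific):

```shell
#!/bin/sh
# count_args just reports how many arguments it received.
count_args() { echo "$#"; }

# A clean continuation: the backslash is the last character on the line,
# so the flags on the next line still belong to the same command.
count_args --task MLMetKE \
           --arch roberta_base
# prints: 4
# Writing "... \ # comment" instead would escape the space, end the command
# early, and leave --arch to be executed as a separate (failing) command.
```

Checking the training script for inline comments or trailing whitespace after each backslash is a cheap first debugging step.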
Hi,
I'm curious, do you happen to know if there's a way to use a pre-trained model from Huggingface models and use that to initialize KEPLER training? I was hoping to initialize KEPLER with a RoBERTa pretrained on medical data and use KEPLER pretraining with medical knowledge graphs and medical MLM.
Many thanks,
Michal
I changed line 56 of the conversion script from
roberta_sent_encoder = roberta.model.encoder.sentence_encoder
to
roberta_sent_encoder = roberta.model.decoder.sentence_encoder
However another error occurred:
AttributeError: 'MultiheadAttention' object has no attribute 'k_proj'
I'm not sure whether we should first modify the conversion script to match the KEPLER code, or whether this is the right way to convert.
My transformers version is 2.0.0, and fairseq is 0.9.0.
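A possible explanation, offered as an assumption: checkpoints from the fairseq 0.9.0 era store each attention layer's projections as one fused in_proj_weight/in_proj_bias, while newer conversion code expects separate q_proj/k_proj/v_proj parameters, hence the missing k_proj. If that is the mismatch, the fused tensors can be split (fairseq's fused layout is q, then k, then v):

```python
import torch

def split_in_proj(in_proj_weight, in_proj_bias):
    """Split a fused attention projection into per-head-type tensors.
    Assumes the fairseq convention of stacking q, k, v along dim 0."""
    dim = in_proj_weight.size(0) // 3
    q_w, k_w, v_w = in_proj_weight.split(dim, dim=0)
    q_b, k_b, v_b = in_proj_bias.split(dim, dim=0)
    return {"q_proj": (q_w, q_b), "k_proj": (k_w, k_b), "v_proj": (v_w, v_b)}
```

The conversion script could then read `self_attn.in_proj_weight` from the old checkpoint and write the three split pairs where it currently expects `q_proj`/`k_proj`/`v_proj`.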
In the inductive setting, how are the train/valid/test sets constructed? Are all entities in the test set unseen?
When I try:
python -m transformers.models.roberta.convert_roberta_original_pytorch_checkpoint_to_pytorch --roberta_checkpoint_path ./ --pytorch_dump_folder_path ./pytorch_model.bin
I got:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.7/site-packages/transformers/models/roberta/convert_roberta_original_pytorch_checkpoint_to_pytorch.py", line 181, in
args.roberta_checkpoint_path, args.pytorch_dump_folder_path, args.classification_head
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.7/site-packages/transformers/models/roberta/convert_roberta_original_pytorch_checkpoint_to_pytorch.py", line 58, in convert_roberta_checkpoint_to_pytorch
roberta = FairseqRobertaModel.from_pretrained(roberta_checkpoint_path)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.7/site-packages/fairseq/models/roberta/model.py", line 251, in from_pretrained
**kwargs,
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.7/site-packages/fairseq/hub_utils.py", line 72, in from_pretrained
arg_overrides=kwargs,
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.7/site-packages/fairseq/checkpoint_utils.py", line 279, in load_model_ensemble_and_task
state = load_checkpoint_to_cpu(filename, arg_overrides)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.7/site-packages/fairseq/checkpoint_utils.py", line 232, in load_checkpoint_to_cpu
state = _upgrade_state_dict(state)
File "/home/ubuntu/anaconda3/envs/pytorch/lib/python3.7/site-packages/fairseq/checkpoint_utils.py", line 434, in _upgrade_state_dict
registry.set_defaults(state["args"], tasks.TASK_REGISTRY[state["args"].task])
KeyError: 'MLMetKE'
Clearly, the script can't handle it. Could you please provide the converted weights? Many thanks.
If I want to train a Chinese version of the model, do you have an open-source Chinese model, or Chinese versions of the gpt2_bpe/vocab.bpe files? Could you share them? Thanks!
--entity2id: a json file that maps entity names (in the dataset) to the ids in the entity embedding numpy file, where the key is the entity names in the dataset, and the value is the id in the numpy file.
--relation2id: a json file that maps relation names (in the dataset) to the ids in the relation embedding numpy file.
where do these files come from?
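If they have to be rebuilt by hand, the key constraint is that the ids must follow the row order of the embedding .npy files. A minimal sketch (the names below are hypothetical placeholders, not taken from the dataset):

```python
import json

# Assumed layout: entity2id.json maps each entity name to the row index of
# its vector in the entity embedding .npy file, so both files must be built
# in the same order in which the embeddings were exported.
entity_names = ["Q30", "Q76", "Q11696"]   # hypothetical Wikidata QIDs
relation_names = ["P31", "P279"]          # hypothetical Wikidata PIDs
entity2id = {name: i for i, name in enumerate(entity_names)}
relation2id = {name: i for i, name in enumerate(relation_names)}
print(json.dumps(entity2id))    # contents of entity2id.json
print(json.dumps(relation2id))  # contents of relation2id.json
```

Here `entity2id["Q76"]` would point to row 1 of the entity embedding array; any mismatch in ordering silently pairs entities with the wrong vectors.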
Hi, I would like to ask: if I want to use relation classification in NLP tasks and retrain the model without using the published checkpoints, do I have to remove the data that overlaps with FewRel when building KE_data, in addition to following your data-processing steps? Also, does each period-terminated segment in Qdesc.txt represent one entity description, with the following one being the description of the tail entity?