kopa's Issues

This is great work, but I hit the error below while reproducing it — is it because 1024 and 512 don't match (the part shown in bold in the error)?

Training Alpaca-LoRA model with params:
base_model: /data2/yuhang/huggingface/hub/models--alpaca-7b
data_path: data/UMLS-train.json
output_dir: /data2/junhong/proj/LLM-KGC/KoPA-main/model/kopa-finetune
batch_size: 12
micro_batch_size: 12
num_epochs: 3
learning_rate: 0.0003
cutoff_len: 512
val_set_size: 0
lora_r: 32
num_prefix: 1
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules: ['q_proj', 'k_proj', 'v_proj', 'o_proj']
train_on_inputs: True
add_eos_token: False
group_by_length: False
wandb_project:
wandb_run_name:
wandb_watch:
wandb_log_model:
resume_from_checkpoint: False
prompt template: alpaca
kge model: data/UMLS-rotate.pth

Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:25<00:00, 8.52s/it]
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thouroughly read the reason why this was added as explained in huggingface/transformers#24565
1024 512
Adapter Trained From Scratch
Map: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 15648/15648 [00:10<00:00, 1520.20 examples/s]
Using the WANDB_DISABLED environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
0%| | 0/1956 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/data2/junhong/proj/LLM-KGC/KoPA-main/finetune_kopa.py", line 282, in
fire.Fire(train)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/data2/junhong/proj/LLM-KGC/KoPA-main/finetune_kopa.py", line 271, in train
trainer.train(resume_from_checkpoint=resume_from_checkpoint)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/transformers/trainer.py", line 1591, in train
return inner_training_loop(
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/transformers/trainer.py", line 1892, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/transformers/trainer.py", line 2776, in training_step
loss = self.compute_loss(model, inputs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/transformers/trainer.py", line 2801, in compute_loss
outputs = model(**inputs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/torch/nn/parallel/data_parallel.py", line 185, in forward
outputs = self.parallel_apply(replicas, inputs, module_kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/torch/nn/parallel/data_parallel.py", line 200, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/torch/nn/parallel/parallel_apply.py", line 110, in parallel_apply
output.reraise()
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/torch/_utils.py", line 694, in reraise
raise exception
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in _worker
output = module(*input, **kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/data2/junhong/proj/LLM-KGC/KoPA-main/kopa.py", line 106, in forward
return forward_call(*args, **kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/peft/peft_model.py", line 918, in forward
return self.base_model(
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 94, in forward
return self.model.forward(*args, **kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1038, in forward
outputs = self.model(
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 925, in forward
layer_outputs = decoder_layer(
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 635, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 349, in forward
query_states = self.q_proj(hidden_states)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/data2/junhong/anaconda3/envs/kopa/lib/python3.10/site-packages/peft/tuners/lora.py", line 902, in forward
result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self.bias)
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::Half

0%| | 0/1956 [00:05<?, ?it/s]
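
A plausible reading of this trace (an assumption, not the authors' confirmed diagnosis): the float32 KG-prefix embeddings are concatenated with the model's float16 token embeddings, torch.cat promotes the result to float32, and the fp16 q_proj then rejects it. A minimal sketch of the mismatch and the cast that avoids it, using illustrative names:

import torch

# The model's token embeddings run in fp16; the KGE prefix stays fp32 (the bug).
token_embeds = torch.randn(1, 4, 16, dtype=torch.float16)
prefix = torch.randn(1, 1, 16)

bad = torch.cat([prefix, token_embeds], dim=1)
print(bad.dtype)   # torch.float32 -> dtype mismatch inside the fp16 model

good = torch.cat([prefix.to(token_embeds.dtype), token_embeds], dim=1)
print(good.dtype)  # torch.float16 -> matches the fp16 q_proj weights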

Some problems while reproducing the code

Hello, we ran the training and prediction code following the steps in the README, with the same hyperparameter settings as the paper. Two problems came up during prediction: 1. the model outputs True for every triple — even on training-set data it only outputs True — although the training loss kept decreasing; 2. prediction raises a warning (A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set padding_side='left' when initializing the tokenizer.), and setting padding_side='left' in the tokenizer did not help. Have you run into these problems?
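
For reference, a minimal tokenizer setup that avoids the right-padding warning for batched decoder-only generation (a sketch; the model path is a placeholder, and this is not a confirmed fix for the always-True outputs):

from transformers import LlamaTokenizer

# Assumed setup: left padding must be in effect at tokenization time, and
# LLaMA ships no pad token by default, so one must be set explicitly.
tokenizer = LlamaTokenizer.from_pretrained("models/alpaca-7b")  # placeholder path
tokenizer.padding_side = "left"
tokenizer.pad_token_id = 0  # unk token as a common stand-in for pad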

The dataset_info.json file does not exist

Hello, I ran into another problem while reproducing the code: the error says dataset_info.json does not exist, but I cannot find anywhere in the code that generates this file, or any place to obtain it.

Training Alpaca-LoRA model with params:
base_model: models/alpaca-7b-wdiff
data_path: data/UMLS-train.json
output_dir: out/finetune_kopa
batch_size: 12
micro_batch_size: 12
num_epochs: 3
learning_rate: 0.0003
cutoff_len: 512
val_set_size: 0
lora_r: 32
num_prefix: 1
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules: ['q_proj', 'k_proj', 'v_proj', 'o_proj']
train_on_inputs: True
add_eos_token: False
group_by_length: False
wandb_project:
wandb_run_name:
wandb_watch:
wandb_log_model:
resume_from_checkpoint: False
prompt template: alpaca
kge model: data/UMLS-rotate.pth

Loading checkpoint shards: 100%|██████████| 3/3 [00:38<00:00, 12.76s/it]
/home/guochenxu/pythonProjects/KoPA-main/process_kge.py:8: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
ent_embs = torch.tensor(kge_model["ent_embeddings.weight"]).cpu()
/home/guochenxu/pythonProjects/KoPA-main/process_kge.py:9: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
rel_embs = torch.tensor(kge_model["rel_embeddings.weight"]).cpu()
1024 512
Adapter Trained From Scratch
Traceback (most recent call last):
File "/home/guochenxu/pythonProjects/KoPA-main/finetune_kopa.py", line 279, in
fire.Fire(train)
File "/home/guochenxu/anaconda310/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/home/guochenxu/anaconda310/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/home/guochenxu/anaconda310/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/home/guochenxu/pythonProjects/KoPA-main/finetune_kopa.py", line 182, in train
data = load_dataset("json")
File "/home/guochenxu/anaconda310/lib/python3.10/site-packages/datasets/load.py", line 1759, in load_dataset
builder_instance = load_dataset_builder(
File "/home/guochenxu/anaconda310/lib/python3.10/site-packages/datasets/load.py", line 1522, in load_dataset_builder
builder_instance: DatasetBuilder = builder_cls(
File "/home/guochenxu/anaconda310/lib/python3.10/site-packages/datasets/builder.py", line 363, in init
self.info = DatasetInfo.from_directory(self._cache_dir)
File "/home/guochenxu/anaconda310/lib/python3.10/site-packages/datasets/info.py", line 358, in from_directory
with fs.open(path_join(dataset_info_dir, config.DATASET_INFO_FILENAME), "r", encoding="utf-8") as f:
File "/home/guochenxu/anaconda310/lib/python3.10/site-packages/fsspec/spec.py", line 1295, in open
self.open(
File "/home/guochenxu/anaconda310/lib/python3.10/site-packages/fsspec/spec.py", line 1307, in open
f = self._open(
File "/home/guochenxu/anaconda310/lib/python3.10/site-packages/fsspec/implementations/local.py", line 180, in _open
return LocalFileOpener(path, mode, fs=self, **kwargs)
File "/home/guochenxu/anaconda310/lib/python3.10/site-packages/fsspec/implementations/local.py", line 302, in init
self._open()
File "/home/guochenxu/anaconda310/lib/python3.10/site-packages/fsspec/implementations/local.py", line 307, in _open
self.f = open(self.path, mode=self.mode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/guochenxu/.cache/huggingface/datasets/json/default-ae25584a5d8560de/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51/dataset_info.json'

My model was generated following the stanford_alpaca instructions:

[screenshot omitted]

The command I ran:

export WANDB_DISABLED=true
wandb offline
CUDA_VISIBLE_DEVICES=0 nohup python finetune_kopa.py \
    --base_model 'models/alpaca-7b' \
    --data_path 'data/UMLS-train.json' \
    --output_dir 'out/finetune_kopa' \
    --num_epochs 3 \
    --lora_r 32 \
    --learning_rate 3e-4 \
    --batch_size 12 \
    --micro_batch_size 12 \
    --num_prefix 1 \
    --kge_model 'data/UMLS-rotate.pth' \
    --lora_target_modules='[q_proj,k_proj,v_proj,o_proj]' > log.txt &
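
One thing the traceback points to: finetune_kopa.py line 182 calls load_dataset("json") without data_files, so datasets falls back to a cached builder directory that does not exist. A sketch of the expected call, assuming the training JSON is the file passed as data_path:

from datasets import load_dataset

# Pass the JSON file explicitly instead of relying on a cached builder.
data = load_dataset("json", data_files="data/UMLS-train.json")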

Training loss is 0 and inference produces empty output

Hello, while reproducing the code I found that the training loss is 0 from the second logged step onward:
{'loss': 1.5336, 'learning_rate': 2.9999999999999997e-05, 'epoch': 0.01}
{'loss': 0.0, 'learning_rate': 5.9999999999999995e-05, 'epoch': 0.01}
{'loss': 0.0, 'learning_rate': 8.999999999999999e-05, 'epoch': 0.02}
Also, running inference after training produces empty output; printing the raw output shows meaningless sequences like <s><unk><unk><unk><unk>.
What could be causing this?
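
One configuration worth ruling out (an assumption, not a confirmed diagnosis): training directly in float16 without loss scaling can underflow gradients, driving the logged loss to zero while generation degenerates to <unk> runs. A sketch of loading the trainable base model in a wider dtype; the path is a placeholder:

import torch
from transformers import LlamaForCausalLM

# Assumption: the zero loss stems from fp16 underflow during training.
# Load the base model in bfloat16 (or float32) for training; reserve
# float16 for inference only.
model = LlamaForCausalLM.from_pretrained(
    "models/alpaca-7b",
    torch_dtype=torch.bfloat16,  # or torch.float32 if bf16 is unsupported
)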

Question about the KGE model's training procedure

Hello,
The Structural Embedding Pre-training in the paper should correspond to kge_model in the code, but I could not find how the embedding model is trained. Could you provide the training code for the embedding model? Thanks!
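
For orientation, a minimal RotatE pre-training loop in plain PyTorch (an illustration of the structural-embedding stage only, NOT the authors' released code; the sizes and random triples are placeholders):

import torch
import torch.nn.functional as F

n_ent, n_rel, dim, gamma = 135, 46, 512, 12.0   # UMLS-scale placeholders

ent = torch.nn.Embedding(n_ent, 2 * dim)        # complex entity embeddings (re, im)
rel = torch.nn.Embedding(n_rel, dim)            # relation phases
opt = torch.optim.Adam(list(ent.parameters()) + list(rel.parameters()), lr=1e-3)

def score(h, r, t):
    hr, hi = ent(h).chunk(2, dim=-1)
    tr, ti = ent(t).chunk(2, dim=-1)
    phase = rel(r)
    rr, ri = torch.cos(phase), torch.sin(phase)
    # rotate h by r in the complex plane, measure distance to t
    dr = hr * rr - hi * ri - tr
    di = hr * ri + hi * rr - ti
    return gamma - torch.sqrt(dr ** 2 + di ** 2 + 1e-9).sum(dim=-1)

h = torch.randint(0, n_ent, (32,))
r = torch.randint(0, n_rel, (32,))
t = torch.randint(0, n_ent, (32,))
neg_t = torch.randint(0, n_ent, (32,))          # uniform negative sampling
loss = -F.logsigmoid(score(h, r, t)).mean() - F.logsigmoid(-score(h, r, neg_t)).mean()
opt.zero_grad(); loss.backward(); opt.step()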

RuntimeError at inference: expected mat1 and mat2 to have the same dtype, but got: float != c10::Half

Hello, generation fails with the error below at inference time. How can I fix this?
Traceback (most recent call last):
File "/data/sdkjualk/pymp_kopa/inference_kopa.py", line 78, in
generate_ids = model.generate(
File "/home/tyy/anaconda3/envs/alpaca/lib/python3.9/site-packages/peft/peft_model.py", line 1034, in generate
outputs = self.base_model.generate(**kwargs)
File "/home/tyy/anaconda3/envs/alpaca/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/tyy/anaconda3/envs/alpaca/lib/python3.9/site-packages/transformers/generation/utils.py", line 1538, in generate
return self.greedy_search(
File "/home/tyy/anaconda3/envs/alpaca/lib/python3.9/site-packages/transformers/generation/utils.py", line 2362, in greedy_search
outputs = self(
File "/home/tyy/anaconda3/envs/alpaca/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/tyy/anaconda3/envs/alpaca/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/tyy/anaconda3/envs/alpaca/lib/python3.9/site-packages/accelerate/hooks.py", line 164, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home/tyy/anaconda3/envs/alpaca/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 824, in forward
logits = self.lm_head(hidden_states)
File "/home/tyy/anaconda3/envs/alpaca/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/tyy/anaconda3/envs/alpaca/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/tyy/anaconda3/envs/alpaca/lib/python3.9/site-packages/accelerate/hooks.py", line 164, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home/tyy/anaconda3/envs/alpaca/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 116, in forward
return F.linear(input, weight, bias)
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::Half
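
A plausible fix to try (an assumption based on the trace): the KG embeddings saved at training time are float32, while the model here is loaded with torch_dtype=torch.float16, so the injected prefix eventually reaches lm_head in the wrong dtype. A sketch of casting the loaded embeddings first:

import torch

# Cast the loaded KG-embedding module to the model's fp16 dtype (sketch;
# the path is illustrative).
kg_embeddings = torch.load("embeddings.pth").to("cuda:0").half()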

Keep hitting KeyError: 'ent_re_embeddings.weight'

Your work is excellent and I am studying it, but I hit some problems while reproducing it — could you take a look? Thanks.
I tried both base_model: str = "tatsu-lab/alpaca-7b-wdiff" and base_model: str = "huggyllama/llama-7b"; with either pretrained model, loading the pretrained knowledge graph embedding (KGE) model raises a KeyError when the code accesses the kge_model dict with the key 'ent_re_embeddings.weight'.
Training Alpaca-LoRA model with params:
base_model: /root/autodl-fs/alpaca-7b-wdiff
data_path: /root/KoPA/data/UMLS-train.json
output_dir: ./lora-alpaca
batch_size: 16
micro_batch_size: 16
num_epochs: 2
learning_rate: 0.0003
cutoff_len: 512
val_set_size: 0
lora_r: 16
num_prefix: 1
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules: ['q_proj', 'v_proj']
train_on_inputs: True
add_eos_token: False
group_by_length: False
wandb_project:
wandb_run_name:
wandb_watch:
wandb_log_model:
resume_from_checkpoint: False
prompt template: alpaca
kge model: /root/KoPA/data/UMLS-rotate.pth

Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████| 6/6 [00:06<00:00, 1.14s/it]
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thouroughly read the reason why this was added as explained in huggingface/transformers#24565
Traceback (most recent call last):
File "finetune_kopa.py", line 281, in
fire.Fire(train)
File "/root/miniconda3/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/root/miniconda3/lib/python3.8/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/root/miniconda3/lib/python3.8/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "finetune_kopa.py", line 181, in train
slama_model = KoPAWithAdapter(model, num_prefix, kge_model=kge_model)
File "/root/KoPA/kopa.py", line 70, in init
ent_embs, rel_embs = load_pretrain_kge(kge_model)
File "/root/KoPA/process_kge.py", line 6, in load_pretrain_kge
return load_complex_model(path)
File "/root/KoPA/process_kge.py", line 24, in load_complex_model
ent_embs1 = torch.tensor(kge_model["ent_re_embeddings.weight"]).cpu()
KeyError: 'ent_re_embeddings.weight'
Please help me figure out where the problem is. Thanks!
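
A quick diagnostic (hypothetical, not part of the repo): list the keys the checkpoint actually contains before a loader is chosen. A RotatE checkpoint stores ent_embeddings.weight / rel_embeddings.weight, whereas the load_complex_model branch of process_kge.py in this traceback expects ComplEx keys such as ent_re_embeddings.weight:

import torch

kge_model = torch.load("data/UMLS-rotate.pth", map_location="cpu")
print(list(kge_model.keys()))  # RotatE: ent_embeddings.weight, rel_embeddings.weight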

can't load tokenizer while reproducing the code

Hi, I'd like to ask whether you used the tloen/alpaca-lora-7b model. When I ran the code I got the following error:

[screenshot omitted]

My model files are as follows (there indeed seems to be no tokenizer file inside):

[screenshot omitted]

My command:

CUDA_VISIBLE_DEVICES=0 nohup python finetune_kopa.py \
    --base_model '/home/guochenxu/pythonProjects/KoPA-main/models/lora_7b' \
    --data_path 'data/UMLS-train.json' \
    --output_dir '/home/guochenxu/pythonProjects/KoPA-main/out/finetune_kopa' \
    --num_epochs 3 \
    --lora_r 32 \
    --learning_rate 3e-4 \
    --batch_size 12 \
    --micro_batch_size 12 \
    --num_prefix 1 \
    --kge_model 'data/UMLS-rotate.pth' \
    --lora_target_modules='[q_proj,k_proj,v_proj,o_proj]' >log.txt &

So where should I get these files from? (I also asked on the alpaca-lora side, but no reply yet.)

Also, if possible, could you share the exact versions of python/transformers/tokenizers/peft and related libraries you used? I hit some other errors earlier that went away after switching library versions, so I'd like to confirm the exact versions just in case 😂

Many thanks 🙏
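
For what it's worth, a sketch of a workaround under the assumption that tloen/alpaca-lora-7b is a LoRA adapter only (no tokenizer or base weights): load the tokenizer and base model from a full LLaMA checkpoint, then apply the adapter on top. The model names are assumptions:

from peft import PeftModel
from transformers import LlamaForCausalLM, LlamaTokenizer

# Tokenizer and base weights come from the full model; the adapter repo
# only provides the LoRA deltas.
tokenizer = LlamaTokenizer.from_pretrained("huggyllama/llama-7b")
model = LlamaForCausalLM.from_pretrained("huggyllama/llama-7b")
model = PeftModel.from_pretrained(model, "tloen/alpaca-lora-7b")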

About the model outputs

I reproduced your work and ran into a problem I'd like to ask about:
Following the experimental setup in the paper, I trained for 4 epochs on an A6000. At inference, the model's output for every prompt is completely unexpected (see the screenshot below), and the final acc, p, r, f1 are 0.5, 0, 0, 0. Where might the problem be? Thanks.
[screenshot omitted]

ValueError: Target modules [q_proj,k_proj,v_proj,o_proj] not found in the base model.

Are you using the model at https://huggingface.co/wxjiao/alpaca-7b? I hit a problem when loading the model.
Traceback (most recent call last):
File "/home/zzu_zxw/miniconda3/envs/alpaca/lib/python3.9/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/home/zzu_zxw/miniconda3/envs/alpaca/lib/python3.9/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/home/zzu_zxw/zjl_data/KoPA/finetune_kopa.py", line 178, in train
model = get_peft_model(model, config)
File "/home/zzu_zxw/miniconda3/envs/alpaca/lib/python3.9/site-packages/peft/mapping.py", line 112, in get_peft_model
return MODEL_TYPE_TO_PEFT_MODEL_MAPPING[peft_config.task_type](model, peft_config)
File "/home/zzu_zxw/miniconda3/envs/alpaca/lib/python3.9/site-packages/peft/peft_model.py", line 647, in init
super().init(model, peft_config, adapter_name)
File "/home/zzu_zxw/miniconda3/envs/alpaca/lib/python3.9/site-packages/peft/peft_model.py", line 91, in init
self.base_model = PEFT_TYPE_TO_MODEL_MAPPING[peft_config.peft_type](
File "/home/zzu_zxw/miniconda3/envs/alpaca/lib/python3.9/site-packages/peft/tuners/lora.py", line 131, in init
self.add_adapter(adapter_name, self.peft_config[adapter_name])
File "/home/zzu_zxw/miniconda3/envs/alpaca/lib/python3.9/site-packages/peft/tuners/lora.py", line 137, in add_adapter
self._find_and_replace(adapter_name)
File "/home/zzu_zxw/miniconda3/envs/alpaca/lib/python3.9/site-packages/peft/tuners/lora.py", line 222, in _find_and_replace
raise ValueError(
ValueError: Target modules [q_proj,k_proj,v_proj,o_proj] not found in the base model. Please check the target modules and try again.
python-BaseException

Process finished with exit code 1
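
A quick check (hypothetical, not from the repo): print the projection-module names of the loaded base model. If they are not q_proj/k_proj/v_proj/o_proj, the checkpoint is not in the standard HF LLaMA layout and PEFT cannot match the requested target modules:

from transformers import LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained("wxjiao/alpaca-7b")  # the model in question
for name, _ in model.named_modules():
    if name.endswith("proj"):
        print(name)  # expect ...self_attn.q_proj / k_proj / v_proj / o_proj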

The method is in doubt

Thank you for your method. I reproduced it and used it for entity and relation prediction, but found it was not even as effective as directly fine-tuning the model. What could be the reason for this?

Fine-tuned results do not meet expectations

Using the finetune_kopa.py script, I fine-tuned llama-7B on the CoDeX dataset and ran inference; the results are:

acc p r f1
0.5043763676148797 0.5022038567493113 0.9972647702407003 0.6680102601685599

🤣 This doesn't match KoPA's reported results, but is surprisingly identical to the zero-shot result without fine-tuning. Is something wrong with my fine-tuning setup?

[screenshot omitted]

The fine-tuning parameters are below. batch_size was reduced from 12 to 6 (limited hardware) — could that have an effect?

export WANDB_DISABLED=true
wandb offline
CUDA_VISIBLE_DEVICES=0 nohup python finetune_kopa.py \
    --base_model 'baffo32/decapoda-research-llama-7B-hf' \
    --data_path 'data/CoDeX-S-train.json' \
    --output_dir 'output-CoDeX' \
    --num_epochs 3 \
    --lora_r 64 \
    --learning_rate 3e-4 \
    --batch_size 6 \ 
    --micro_batch_size 6 \
    --num_prefix 1 \
    --kge_model 'data/CoDeX-S-rotate.pth' \
    --lora_target_modules='[q_proj,k_proj,v_proj,o_proj]' > log-CoDeX.log &

The inference code is modified from inference_kopa.py:

base_path = 'baffo32/decapoda-research-llama-7B-hf'
if __name__ == "__main__":
    cuda = "cuda:0"
    lora_weights = "output-CoDeX"
    test_data_path = "data/CoDeX-S-test.json"
    embedding_path = "{}/embeddings.pth".format(lora_weights)
    test_dataset = load_test_dataset(test_data_path)
    kg_embeddings = torch.load(embedding_path).to(cuda)
    tokenizer = LlamaTokenizer.from_pretrained(base_path)
    model = LlamaForCausalLM.from_pretrained(
        base_path,
        torch_dtype=torch.float16
    ).to(cuda)
    model = PeftModel.from_pretrained(
        model,
        lora_weights,
        torch_dtype=torch.float16,
    ).to(cuda)
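
One sanity check worth running on the model loaded above (a hypothetical diagnostic, not from inference_kopa.py): PEFT initializes every lora_B matrix to zero, so if they are still all zeros after training, the adapter contributes nothing and inference collapses to the zero-shot base model:

# Sum of |lora_B| over the PeftModel loaded above; ~0.0 means untrained.
total = sum(
    p.abs().sum().item()
    for n, p in model.named_parameters()
    if "lora_B" in n
)
print("sum |lora_B| =", total)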

NotImplementedError: Cannot copy out of meta tensor; no data! Please help, author!

I checked the earlier issues on this and tried transformers 4.28.0 and torch 2.0.0, but the problem remains. Could the original author please check the source code?
root@autodl-container-21104cb00b-08b6ed91:~/autodl-tmp/KoPA# python finetune_kopa.py
Training Alpaca-LoRA model with params:
base_model: huggyllama/llama-7b
data_path: data/UMLS-train.json
output_dir: data/save
batch_size: 16
micro_batch_size: 16
num_epochs: 2
learning_rate: 0.0003
cutoff_len: 512
val_set_size: 0
lora_r: 16
num_prefix: 1
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules: ['q_proj', 'v_proj']
train_on_inputs: True
add_eos_token: False
group_by_length: False
wandb_project:
wandb_run_name:
wandb_watch:
wandb_log_model:
resume_from_checkpoint: False
prompt template: alpaca
kge model: data/UMLS-rotate.pth

Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████| 2/2 [00:03<00:00, 1.97s/it]
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thouroughly read the reason why this was added as explained in huggingface/transformers#24565
Model parameters device before moving to CUDA: cuda:0
Model parameters device after moving to CUDA: cuda:0
/root/autodl-tmp/KoPA/process_kge.py:10: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
ent_embs = torch.tensor(kge_model["ent_embeddings.weight"])
/root/autodl-tmp/KoPA/process_kge.py:11: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
rel_embs = torch.tensor(kge_model["rel_embeddings.weight"])
1024 512
Adapter Trained From Scratch
Map: 100%|██████████████████████████████████████████████████████████████████████████████| 15648/15648 [00:14<00:00, 1047.00 examples/s]
Detected kernel version 5.4.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
Traceback (most recent call last):
File "finetune_kopa.py", line 288, in
fire.Fire(train)
File "/root/miniconda3/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/root/miniconda3/lib/python3.8/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/root/miniconda3/lib/python3.8/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "finetune_kopa.py", line 236, in train
trainer = transformers.Trainer(
File "/root/miniconda3/lib/python3.8/site-packages/transformers/trainer.py", line 481, in init
self._move_model_to_device(model, args.device)
File "/root/miniconda3/lib/python3.8/site-packages/transformers/trainer.py", line 716, in _move_model_to_device
model = model.to(device)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1160, in to
return self._apply(convert)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 810, in _apply
module._apply(fn)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 810, in _apply
module._apply(fn)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 810, in _apply
module._apply(fn)
[Previous line repeated 6 more times]
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 833, in _apply
param_applied = fn(param)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1158, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
NotImplementedError: Cannot copy out of meta tensor; no data!
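
A sketch of one plausible workaround (an assumption, not a verified fix): this meta-tensor error typically appears when the model was loaded with device_map="auto" (accelerate leaves offloaded weights as meta tensors) and transformers.Trainer then calls model.to(device). Loading without a device map lets the Trainer place the full model itself:

from transformers import LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained(
    "huggyllama/llama-7b",   # model from the report above
    # note: no device_map="auto" here
)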

NotImplementedError: Cannot copy out of meta tensor; no data!

Is the error below caused by the transformers version? Which version are you using?
Traceback (most recent call last):
File "/home/zzu_zxw/miniconda3/envs/alpaca/lib/python3.9/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/home/zzu_zxw/miniconda3/envs/alpaca/lib/python3.9/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/home/zzu_zxw/zjl_data/KoPA/finetune_kopa.py", line 232, in train
trainer = transformers.Trainer(
File "/home/zzu_zxw/miniconda3/envs/alpaca/lib/python3.9/site-packages/transformers/trainer.py", line 498, in init
self._move_model_to_device(model, args.device)
File "/home/zzu_zxw/miniconda3/envs/alpaca/lib/python3.9/site-packages/transformers/trainer.py", line 740, in _move_model_to_device
model = model.to(device)
File "/home/zzu_zxw/miniconda3/envs/alpaca/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1145, in to
return self._apply(convert)
File "/home/zzu_zxw/miniconda3/envs/alpaca/lib/python3.9/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/home/zzu_zxw/miniconda3/envs/alpaca/lib/python3.9/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/home/zzu_zxw/miniconda3/envs/alpaca/lib/python3.9/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
[Previous line repeated 7 more times]
File "/home/zzu_zxw/miniconda3/envs/alpaca/lib/python3.9/site-packages/torch/nn/modules/module.py", line 820, in _apply
param_applied = fn(param)
File "/home/zzu_zxw/miniconda3/envs/alpaca/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1143, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
NotImplementedError: Cannot copy out of meta tensor; no data!

Process finished with exit code 1

Head/tail entity prediction for a given query

Thank you very much for this comprehensive and instructive work. For the entity-prediction task, I think one could score all candidate triples, where the score is the probability that the triple is classified as positive. Is this correct? Looking forward to your reply.
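
A sketch of that idea (an illustration, not the authors' implementation): score a candidate triple by the probability the fine-tuned model assigns to the "True" token, then rank all candidate heads/tails by that score. Here model, tokenizer, and the prompt string are assumed to exist:

import torch

def triple_score(model, tokenizer, prompt: str) -> float:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]        # next-token distribution
    true_id = tokenizer.encode("True", add_special_tokens=False)[0]
    false_id = tokenizer.encode("False", add_special_tokens=False)[0]
    probs = torch.softmax(logits[[true_id, false_id]], dim=-1)
    return probs[0].item()                            # P(True), renormalized

# Ranking all candidate tails for a query (h, r, ?) by this score gives
# tail prediction; the symmetric case (?, r, t) gives head prediction.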

wandb: command not found

Hello, does the wandb call in the .sh script need any setup? Running it gives me an error; online sources say a wandb API key must be configured — or is pip install wandb enough for this code?
[screenshot omitted]

About negative sampling

This is very meaningful work! To reproduce it better, I have a few questions:

  1. Training uses negative sampling — what negative sampling ratio was used for the current results (the arXiv preprint version)?
  2. What is the negative sampling method?
  3. If negative sampling makes positive and negative examples unbalanced, is re-sampling / under-sampling applied during training, and is batch construction fully random?

Questions about training

Hello, I'd like to ask whether negative samples need to be added to the training set. Table 2 of the paper lists the CoDeX-S-Train dataset size as 32,888, but when I open CoDeX-S-train.json I find not only the 32,888 positives but also 131,522 negatives. Do negatives need to be included in the training set?

A question about the inductive setting

Two questions: 1. In the inductive setting used in this paper there are some unseen entities — are these entities unseen during LLM training, or unseen during both TransE and LLM training?
2. If two knowledge graphs are not aligned and two separately trained TransE / RotatE models M1 and M2 are obtained, and the LLM is fine-tuned with embeddings produced by M1, can it generalize to M2 without retraining?

Fine-tuned inference results much lower than the paper (FB15K-237N dataset + llama7B)

Fine-tuning the llama-7B model on the FB15K-237N dataset, inference gives:

acc p r f1
0.5651592511548748 0.5417185554171855 0.8460977388767323 0.6605295624940685

[screenshot omitted]

Fine-tuning parameters below, run on an A6000 with batch_size=4:

export WANDB_DISABLED=true
wandb offline
CUDA_VISIBLE_DEVICES=0 nohup python finetune_kopa.py \
    --base_model 'baffo32/decapoda-research-llama-7B-hf' \
    --data_path 'data/FB15K-237N-train.json' \
    --output_dir 'output/llama_fb_2' \
    --num_epochs 3 \
    --lora_r 64 \
    --learning_rate 3e-4 \
    --batch_size 4 \
    --micro_batch_size 4 \
    --num_prefix 1 \
    --kge_model 'data/FB15K-237N-rotate.pth' \
    --lora_target_modules='[q_proj,k_proj,v_proj,o_proj]' > log_llama_fb2.txt &

After fine-tuning, 'train_loss': 0.0794358075707171
The trained LoRA weight files are not empty either; their sizes are:
[screenshot omitted]

Inference code:

base_path = 'baffo32/decapoda-research-llama-7B-hf'
if __name__ == "__main__":
    cuda = "cuda:0"
    lora_weights = "output/llama_fb_2"
    test_data_path = "data/FB15K-237N-test.json"

What might be causing such low results?

Results are 0.5, 0.0, 0.0, 0.0

Hello, I did not change any code and used Stanford Alpaca, but the results on both datasets are 0.5, 0.0, 0.0, 0.0. I checked the LoRA weights and they are not empty files. How can I fix this? Thanks!!
[screenshot omitted]
[screenshot omitted]
