Thanks for your inspiring work

Cannot approach the performance of the uploaded self-rag ckpt when finetuning meta/Llama-2 myself (self-rag issue, open, 5 comments)

HazekiahWon commented on August 19, 2024

Cannot approach the performance of the uploaded self-rag ckpt when finetuning meta/Llama-2 myself

from self-rag.

Comments (5)

Jack-ZC8 commented on August 19, 2024

I encountered the problem (assert loaded_weight.shape[parallel_dim] == self.num_embeddings), but I solved it by deleting the file model.safetensors ...

Jack-ZC8 commented on August 19, 2024

It seems I have the same question. I would appreciate any possible solution! @HazekiahWon @AkariAsai

hummingbird2030 commented on August 19, 2024

Hello, I encountered the same problem and wonder how you load the model. I trained a model and got the same files, including model.safetensors. When I try to load the model for evaluation by running run_short_form.py, this problem occurs. I deleted model.safetensors as #21 suggested, but that did not solve the problem. @HazekiahWon @Jack-ZC8 @AkariAsai

  File "run_short_form.py", line 302, in main
    model = LLM(model=gpt, download_dir=args.download_dir,
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/entrypoints/llm.py", line 105, in __init__
    self.llm_engine = LLMEngine.from_engine_args(engine_args)
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 250, in from_engine_args
    engine = cls(*engine_configs,
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 110, in __init__
    self._init_workers(distributed_init_method)
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 146, in _init_workers
    self._run_workers(
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 755, in _run_workers
    self._run_workers_in_batch(workers, method, *args, **kwargs))
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 729, in _run_workers_in_batch
    output = executor(*args, **kwargs)
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/worker/worker.py", line 79, in load_model
    self.model_runner.load_model()
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/worker/model_runner.py", line 57, in load_model
    self.model = get_model(self.model_config)
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/model_executor/model_loader.py", line 72, in get_model
    model.load_weights(model_config.model, model_config.download_dir,
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/model_executor/layers/../models/llama.py", line 340, in load_weights
    weight_loader(param, loaded_weight)
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/model_executor/layers/vocab_parallel_embedding.py", line 80, in weight_loader
    assert loaded_weight.shape[parallel_dim] == self.num_embeddings
AssertionError
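
The assertion at the bottom of the traceback compares the vocabulary dimension of the loaded embedding weight with the number of embeddings the model was built with. One way to see which file disagrees is to read the tensor shapes straight from the .safetensors header and compare them with vocab_size in config.json. The following is only a minimal diagnostic sketch, not part of self-rag or vLLM. It assumes the checkpoint directory contains a config.json, and that the embedding tensor is named model.embed_tokens.weight as in Hugging Face Llama checkpoints. The function names are hypothetical.

```python
import json
import struct
from pathlib import Path

# Assumption: name of the input embedding tensor in Hugging Face Llama checkpoints.
EMBED_KEY = "model.embed_tokens.weight"


def read_safetensors_shapes(path):
    """Return {tensor_name: shape} from a .safetensors header without loading weights."""
    with open(path, "rb") as f:
        # The file starts with an 8-byte little-endian header length,
        # followed by that many bytes of JSON describing every tensor.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return {
        name: info["shape"]
        for name, info in header.items()
        if name != "__metadata__"  # optional free-form metadata entry, not a tensor
    }


def check_vocab_size(ckpt_dir, embed_key=EMBED_KEY):
    """Compare vocab_size in config.json with the embedding rows in each .safetensors file.

    Returns (vocab_size, {file_name: (embedding_rows, matches_config)}).
    """
    ckpt_dir = Path(ckpt_dir)
    config = json.loads((ckpt_dir / "config.json").read_text())
    vocab_size = config["vocab_size"]
    report = {}
    for shard in sorted(ckpt_dir.glob("*.safetensors")):
        shapes = read_safetensors_shapes(shard)
        if embed_key in shapes:
            rows = shapes[embed_key][0]
            report[shard.name] = (rows, rows == vocab_size)
    return vocab_size, report


# Example call (the path is a placeholder for your own checkpoint directory):
# vocab_size, report = check_vocab_size("path/to/your/checkpoint")
# print(vocab_size, report)
```

If a file is reported with matches_config set to False, its embedding was saved with a different vocabulary size than the config declares, which would be consistent with the AssertionError above. The script only inspects .safetensors files. Any pytorch_model*.bin files in the same directory would need to be checked separately.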

hummingbird2030 commented on August 19, 2024

> I encountered the problem (assert loaded_weight.shape[parallel_dim] == self.num_embeddings), but I solved it by deleting the file model.safetensors ...

Thanks, I solved this problem the same way!

fate-ubw commented on August 19, 2024

@HazekiahWon Hello,
I recently trained selfrag-7b based on Llama 2 as well and ran into the same problem as you. Using the training data and scripts provided by self-rag, I fine-tuned llama2-7b to obtain selfrag-7b-myversion. When I evaluated selfrag-7b-myversion, its metrics were lower than those of the officially released selfrag-7b. I saw that you reported an accuracy of 0.503 for your fine-tuned model on the TQA dataset. Could you please share your model's results on the PopQA, ARC-Challenge, and PubHealth datasets as well? Thank you very much, and I look forward to your reply.
