Comments (5)

Jack-ZC8 commented on August 19, 2024

I encountered the problem (assert loaded_weight.shape[parallel_dim] == self.num_embeddings), but I solved it by deleting the file model.safetensors ...
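
The workaround above can be tried without losing the file by renaming it instead of deleting it. This is a minimal sketch, not the commenter's exact procedure: the helper name `set_aside_safetensors` and the `.bak` suffix are hypothetical, and it assumes the checkpoint directory also holds PyTorch `.bin` weights for the loader to use. The thread does not explain why removing the file helps.

```python
from pathlib import Path
from typing import Optional


def set_aside_safetensors(model_dir: str, suffix: str = ".bak") -> Optional[Path]:
    """Rename model.safetensors so a model loader no longer sees it.

    Hypothetical helper. Returns the backup path, or None when nothing was
    renamed (file absent, or no .bin weights left in the directory).
    """
    directory = Path(model_dir)
    target = directory / "model.safetensors"
    if not target.is_file():
        return None
    # Assumption: the remaining weights are PyTorch .bin files.
    # Refuse to hide the only copy of the weights.
    if not any(directory.glob("*.bin")):
        return None
    backup = target.with_name(target.name + suffix)
    target.rename(backup)
    return backup
```

Renaming the backup to `model.safetensors` again restores the original layout if the workaround does not help.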

from self-rag.

Jack-ZC8 commented on August 19, 2024

It seems I have the same problem. I would appreciate any possible solution! @HazekiahWon @AkariAsai

hummingbird2030 commented on August 19, 2024

Hello, I encountered the same problem and wonder how you load the model. I trained a model and got the same files, including model.safetensors. When I try to load the model for evaluation by running run_short_form.py, this problem occurs. I deleted model.safetensors as #21 suggested, but that didn't solve the problem. @HazekiahWon @Jack-ZC8 @AkariAsai

  File "run_short_form.py", line 302, in main
    model = LLM(model=gpt, download_dir=args.download_dir,
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/entrypoints/llm.py", line 105, in __init__
    self.llm_engine = LLMEngine.from_engine_args(engine_args)
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 250, in from_engine_args
    engine = cls(*engine_configs,
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 110, in __init__
    self._init_workers(distributed_init_method)
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 146, in _init_workers
    self._run_workers(
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 755, in _run_workers
    self._run_workers_in_batch(workers, method, *args, **kwargs))
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 729, in _run_workers_in_batch
    output = executor(*args, **kwargs)
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/worker/worker.py", line 79, in load_model
    self.model_runner.load_model()
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/worker/model_runner.py", line 57, in load_model
    self.model = get_model(self.model_config)
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/model_executor/model_loader.py", line 72, in get_model
    model.load_weights(model_config.model, model_config.download_dir,
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/model_executor/models/llama.py", line 340, in load_weights
    weight_loader(param, loaded_weight)
  File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/model_executor/layers/vocab_parallel_embedding.py", line 80, in weight_loader
    assert loaded_weight.shape[parallel_dim] == self.num_embeddings
AssertionError
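
The failing assertion compares the embedding tensor in the checkpoint against the number of embeddings the model expects. One way to see whether a checkpoint file disagrees with the config is to read the tensor shapes from the `.safetensors` header, which needs only the standard library. This is a rough diagnostic sketch under stated assumptions: the helper names are hypothetical, the tensor name `model.embed_tokens.weight` follows the Hugging Face Llama naming, and it assumes the expected size comes from `vocab_size` in `config.json`, which the thread does not confirm.

```python
import json
import struct
from pathlib import Path
from typing import Dict, List, Optional


def read_safetensors_shapes(path: str) -> Dict[str, List[int]]:
    """Return {tensor_name: shape} from a .safetensors header without loading weights."""
    with open(path, "rb") as f:
        # The file begins with an 8-byte little-endian header length,
        # followed by that many bytes of JSON describing each tensor.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return {
        name: info["shape"]
        for name, info in header.items()
        if name != "__metadata__"  # optional free-form metadata entry
    }


def embedding_matches_config(
    model_dir: str,
    tensor_name: str = "model.embed_tokens.weight",  # assumed HF Llama name
) -> Optional[bool]:
    """Compare the embedding row count with vocab_size from config.json.

    Returns None if the tensor is not present in model.safetensors.
    """
    directory = Path(model_dir)
    config = json.loads((directory / "config.json").read_text())
    shapes = read_safetensors_shapes(str(directory / "model.safetensors"))
    if tensor_name not in shapes:
        return None
    return shapes[tensor_name][0] == config["vocab_size"]
```

If `embedding_matches_config("path/to/checkpoint")` returns `False`, the safetensors file and the config disagree on vocabulary size, which would be consistent with the assertion above. The path here is a placeholder.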

hummingbird2030 commented on August 19, 2024

I encountered the problem (assert loaded_weight.shape[parallel_dim] == self.num_embeddings), but I solved it by deleting the file model.safetensors ...

Thanks, I solved this problem the same way!

fate-ubw commented on August 19, 2024

@HazekiahWon Hello, HazekiahWon.
Recently, I also trained selfrag-7b based on llama2 and encountered the same problem as you. Using the training data and scripts provided by selfrag, I obtained selfrag-7b-myversion by fine-tuning llama2-7b. When I evaluated selfrag-7b-myversion, its performance metrics were not as good as those of the officially released selfrag-7b. I saw that you reported the evaluation results of your fine-tuned model on the TQA dataset, with an accuracy of 0.503. Could you please also share the results of your model on the PopQA, ARC-Challenge, and PubHealth datasets? Thank you very much, and I look forward to your reply.
