Hi, @santurini
ModuleNotFoundError: No module named 'neural_speed.mistral_cpp'
This means the installation was not successful.
Please reinstall ITREX & NeuralSpeed from source. I have verified that this works:
git clone https://github.com/intel/intel-extension-for-transformers.git
cd intel-extension-for-transformers
pip install -r requirements.txt
python setup.py install
cd ..
git clone https://github.com/intel/neural-speed.git
cd neural-speed
pip install -r requirements.txt
python setup.py install
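After reinstalling, you can confirm that the package and its compiled bindings (the missing neural_speed.mistral_cpp from the error above) actually resolve before running anything. This is a minimal sketch using only the standard library; the helper name is my own:

```python
import importlib.util

def module_available(name: str) -> bool:
    """Return True if `name` can be imported, without actually importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package in a dotted name is itself missing.
        return False

# After a successful build, both of these should be True:
# module_available("neural_speed")
# module_available("neural_speed.mistral_cpp")
```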
Then, use this script:
from transformers import AutoTokenizer, TextStreamer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM
#model_name = "Intel/neural-chat-7b-v3-1" # Hugging Face model_id or local model
# git lfs install & git clone https://huggingface.co/Intel/neural-chat-7b-v3-1
model_name = "/home/zhenzhong/model/neural-chat-7b-v3-1"
prompt = "Once upon a time, there existed a little girl,"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
inputs = tokenizer(prompt, return_tensors="pt").input_ids
streamer = TextStreamer(tokenizer)
model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True)
outputs = model.generate(inputs, streamer=streamer, max_new_tokens=300)
As shown, model_name has been changed to a local path. Everything should work.
BTW, online loading is a known issue for this model. I will fix it tomorrow and let you know.
Thanks.
from neural-speed.
Hi, online loading for Intel/neural-chat-7b-v3-1 should work now if you cherry-pick #132.
Can I ask which OS you are working on? I tried to reproduce your steps using Ubuntu-20.04 on WSL-2, but I get errors during the installation (python setup.py install) of ITREX.
@santurini Hi
I am running ITREX & NeuralSpeed on Linux; Ubuntu-20.04 on WSL-2 should be fine, I think.
Please check your g++ toolchain. I am using gcc 13.2.0 (conda-forge gcc 13.2.0-5):
conda install conda-forge::gxx
Make sure you have first uninstalled all existing copies of ITREX & NeuralSpeed, using pip uninstall neural-speed & pip uninstall intel-extension-for-transformers.
If you update gxx, you also need to delete the build & neural_speed.egg-info directories before reinstalling.
In addition, if you are building from source and installing the package (non-editable installation), please check whether #88 (comment) helps. If that doesn't work, showing the output of the following commands may help:
export ns_dir=$(python -c "import neural_speed; print(neural_speed.__path__[0])")
echo $ns_dir
ls $ns_dir
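For reference, here is a cross-platform Python equivalent of those shell commands (handy on Windows, where `export` is unavailable); the helper name is my own:

```python
import importlib
import os

def package_contents(name: str) -> list[str]:
    """Print an installed package's directory and return its file listing."""
    pkg = importlib.import_module(name)
    pkg_dir = pkg.__path__[0]   # same value as $ns_dir above
    print(pkg_dir)
    return sorted(os.listdir(pkg_dir))

# e.g. package_contents("neural_speed")
```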
Sorry for the late response; after experimenting with the environment a bit, I was able to compile everything correctly.
Should I open a new issue to ask about supported quantization methods?
When trying to load mistral-7b-instruct-v0.1.Q4_K_M.gguf, I get the following error:
error loading model: unrecognized tensor type 12
model_init_from_file: failed to load model
Segmentation fault
@santurini Hi, we don't support Qx_K_M.gguf / Qx_K_S.gguf currently. Please try a q4_0.gguf file, and check the supported models list: https://github.com/intel/neural-speed/blob/main/docs/supported_models.md