
Comments (7)

Zhenzhong1 commented on July 28, 2024

Hi, @santurini

ModuleNotFoundError: No module named 'neural_speed.mistral_cpp'

This error means the installation did not complete successfully.

Please reinstall ITREX and NeuralSpeed from source. I have checked that this works.

git clone https://github.com/intel/intel-extension-for-transformers.git
cd intel-extension-for-transformers
pip install -r requirements.txt
python setup.py install
cd ..

git clone https://github.com/intel/neural-speed.git
cd neural-speed
pip install -r requirements.txt
python setup.py install
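
To confirm that the reinstall actually produced the compiled module, a quick check like the one below can be run before the full script. This is only a sketch: module_available is a helper name made up for this example, and it only reports whether Python can locate the module, not whether the module loads without errors.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if Python can locate the module `name`."""
    try:
        # For a dotted name, find_spec imports the parent package first and
        # raises ModuleNotFoundError if that parent is missing.
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        return False


if __name__ == "__main__":
    # The second name is the module from the error message in this thread.
    for name in ("neural_speed", "neural_speed.mistral_cpp"):
        status = "found" if module_available(name) else "MISSING"
        print(f"{name}: {status}")
```

If neural_speed is found but neural_speed.mistral_cpp is missing, the Python package was installed without its compiled extension, which matches the error above.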

Then run this script:

from transformers import AutoTokenizer, TextStreamer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

#model_name = "Intel/neural-chat-7b-v3-1"     # Hugging Face model_id or local model
# git lfs install & git clone https://huggingface.co/Intel/neural-chat-7b-v3-1
model_name = "/home/zhenzhong/model/neural-chat-7b-v3-1"
prompt = "Once upon a time, there existed a little girl,"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
inputs = tokenizer(prompt, return_tensors="pt").input_ids
streamer = TextStreamer(tokenizer)

model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True)
outputs = model.generate(inputs, streamer=streamer, max_new_tokens=300)

As shown, model_name now points to a local copy of the model. With that change, everything should work.
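
If you want the same script to work on machines with and without a local copy, one option is to pick the path at run time. The helper below is a hypothetical sketch (resolve_model_name is not part of ITREX or transformers); it just checks whether the directory exists and otherwise returns the Hugging Face model_id.

```python
import os


def resolve_model_name(local_dir: str, hub_id: str) -> str:
    """Prefer a local model directory; otherwise fall back to the hub ID."""
    expanded = os.path.expanduser(local_dir)
    return expanded if os.path.isdir(expanded) else hub_id


# Example with the values used in the script above:
model_name = resolve_model_name(
    "/home/zhenzhong/model/neural-chat-7b-v3-1",  # local clone, if present
    "Intel/neural-chat-7b-v3-1",                  # Hugging Face model_id
)
```

The result can be passed to AutoTokenizer.from_pretrained and AutoModelForCausalLM.from_pretrained exactly as model_name is in the script.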

BTW, loading this model online (by its Hugging Face model_id) is a known issue. I will fix it tomorrow and let you know.

Thanks.

from neural-speed.

Zhenzhong1 commented on July 28, 2024

Hi, online loading for Intel/neural-chat-7b-v3-1 should work now if you cherry-pick #132.


santurini commented on July 28, 2024

May I ask which OS you are working on? I tried to reproduce your steps using Ubuntu-20.04 on WSL-2, but I get errors while installing ITREX (python setup.py install).


Zhenzhong1 commented on July 28, 2024

@santurini Hi

I am running ITREX and NeuralSpeed on Linux. Ubuntu-20.04 on WSL-2 should be OK, I think.

Please check your gxx installation. I am using gcc version 13.2.0 (conda-forge gcc 13.2.0-5). You can install it with:

conda install conda-forge::gxx
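
To see which compiler the build will actually pick up from your PATH, something like the following may help. It is a rough sketch: compiler_version is a name invented for this example, and it simply returns the first line printed by the compiler's --version flag.

```python
import shutil
import subprocess
from typing import Optional


def compiler_version(cmd: str = "g++") -> Optional[str]:
    """Return the first line of `<cmd> --version`, or None if cmd is not on PATH."""
    path = shutil.which(cmd)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    for cmd in ("gcc", "g++"):
        print(f"{cmd}: {compiler_version(cmd) or 'not found on PATH'}")
```

Run it inside the same conda environment you build in, since the conda-forge compiler is only on PATH while that environment is active.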


Make sure you have uninstalled every existing ITREX and NeuralSpeed installation first, using pip uninstall neural-speed and the corresponding pip uninstall for ITREX.

If you update gxx, you also need to delete the build and neural_speed.egg-info directories before reinstalling.
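
The cleanup step can be scripted if you rebuild often. The sketch below is an assumption-laden example rather than project tooling: clean_build_artifacts is a made-up helper, and it expects to be pointed at the root of your neural-speed checkout.

```python
import shutil
from pathlib import Path


def clean_build_artifacts(repo_root, names=("build", "neural_speed.egg-info")):
    """Delete stale build directories under repo_root; return the ones removed."""
    removed = []
    for name in names:
        target = Path(repo_root) / name
        if target.is_dir():
            shutil.rmtree(target)  # permanently removes the directory tree
            removed.append(target)
    return removed


# Example: clean_build_artifacts(".") when run from the neural-speed checkout.
```

Directories that do not exist are skipped, so running it on an already clean checkout is harmless.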


DDEle commented on July 28, 2024

In addition, if you are building from source and installing the package normally (a non-editable installation), please check whether #88 (comment) helps. If it doesn't, the output of the following commands may help us diagnose the problem.

export ns_dir=$(python -c "import neural_speed; print(neural_speed.__path__[0])")
echo $ns_dir
ls $ns_dir
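
If a shell is not convenient (for example in a notebook), roughly the same information can be collected from Python. This is a sketch only; package_listing is a hypothetical helper, not something neural_speed provides.

```python
import importlib
import os


def package_listing(name: str):
    """Return (package_dir, sorted directory entries) for an importable package."""
    module = importlib.import_module(name)
    package_dir = list(module.__path__)[0]
    return package_dir, sorted(os.listdir(package_dir))


# Example (requires neural_speed to be installed):
#   ns_dir, entries = package_listing("neural_speed")
#   print(ns_dir)
#   print("\n".join(entries))
```

In the listing, look for compiled extension files whose names start with mistral_cpp; if none are present, the install copied the Python sources without the built modules.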


santurini commented on July 28, 2024

Sorry for the late response. After experimenting a bit with the environment, I was able to compile everything correctly.
Should I open a new issue to ask about supported quantization methods?

When trying to load mistral-7b-instruct-v0.1.Q4_K_M.gguf, I get the following error:

error loading model: unrecognized tensor type 12

model_init_from_file: failed to load model
Segmentation fault


Zhenzhong1 commented on July 28, 2024

@santurini Hi, we don't currently support Qx_K_M.gguf / Qx_K_S.gguf files. Please try a q4_0.gguf file instead, and see the supported models list: https://github.com/intel/neural-speed/blob/main/docs/supported_models.md
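
Since an unsupported file ends in a segmentation fault, it can be worth screening the filename before loading. The check below is a heuristic sketch: looks_like_k_quant is an invented helper that relies only on the usual GGUF naming convention (Q4_K_M, Q5_K_S and so on), and it does not inspect the tensor types stored in the file.

```python
import re

# Matches names such as "Q4_K_M", "q5_K_S" or "Q6_K" inside a filename.
_K_QUANT_PATTERN = re.compile(
    r"(?<![A-Za-z0-9])q\d_k(?:_[sml])?(?![A-Za-z0-9])", re.IGNORECASE
)


def looks_like_k_quant(filename: str) -> bool:
    """Guess from the filename whether a GGUF file uses a K-quant format."""
    return _K_QUANT_PATTERN.search(filename) is not None


if __name__ == "__main__":
    for name in (
        "mistral-7b-instruct-v0.1.Q4_K_M.gguf",
        "mistral-7b-instruct-v0.1.q4_0.gguf",
    ):
        verdict = "likely unsupported (K-quant)" if looks_like_k_quant(name) else "no K-quant marker"
        print(f"{name}: {verdict}")
```

A renamed file will fool this check, so treat a negative result as a hint and not a guarantee.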

