Comments (10)
In ingest.py, replace the embedding model with a smaller one:
# Create embeddings
# instructor-xl gives an out-of-memory error, so use a smaller model; if that still fails, use instructor-base
# A larger model can give better results, but it is useless if you can't load it
embeddings = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-large",
                                           model_kwargs={"device": device})
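If you're unsure which size fits your card, a small helper can encode the fallback chain described above. This is a hypothetical sketch (pick_instructor_model and its VRAM thresholds are rough guesses of mine, not part of localGPT):

```python
def pick_instructor_model(vram_gb: float) -> str:
    """Pick an instructor embedding model by available VRAM.

    Thresholds are rough assumptions, not measured requirements.
    """
    if vram_gb >= 12:
        return "hkunlp/instructor-xl"
    if vram_gb >= 6:
        return "hkunlp/instructor-large"
    return "hkunlp/instructor-base"

# e.g. on an 8GB card this falls back to instructor-large
print(pick_instructor_model(8))  # -> hkunlp/instructor-large
```

The returned name can then be passed straight to HuggingFaceInstructEmbeddings as model_name.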
from localgpt.
The same embedding call appears in the run_localGPT.py file, so replace it there too:
embeddings = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-large",
                                           model_kwargs={"device": device})
from localgpt.
Aaaand run_localGPT.py also downloads the 7B Vicuna model TheBloke/vicuna-7B-1.1-HF, which is too large as well, so we'll probably need a smaller model for that one too to prevent CUDA out-of-memory errors. I've tried changing it to cerebras/Cerebras-GPT-2.7B, which I know fits on my 3070 card, but that didn't work. Maybe someone else has time to find a smaller model that does work with this script.
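For context on why a 7B model overwhelms an 8GB card: the weights alone, before activations and KV cache, already exceed the VRAM. A rough back-of-envelope (the helper below is illustrative, not part of localGPT):

```python
def weights_gib(n_params: float, bytes_per_param: int) -> float:
    """Approximate weight-only memory footprint in GiB."""
    return n_params * bytes_per_param / 2**30

# Vicuna-7B: roughly 7e9 parameters
print(round(weights_gib(7e9, 2), 1))  # fp16: ~13.0 GiB, already more than 8GB VRAM
print(round(weights_gib(7e9, 4), 1))  # fp32: ~26.1 GiB
```

By this estimate, even a 2.7B model in fp32 is around 10 GiB of weights, which may be why the Cerebras swap alone wasn't enough.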
from localgpt.
OK, if you run this on Linux, try these steps. It looks like the Constitution needs 66GB by itself. FYI, this is very slow...
# Turn off all swap processes
sudo swapoff -a
# Resize the swap file (from 512 MB to 100GB)
sudo dd if=/dev/zero of=/swapfile bs=1G count=100
# Restrict permissions to owner read/write, as mkswap expects
sudo chmod 600 /swapfile
# Make the file usable as swap
sudo mkswap /swapfile
# Activate the swap file
sudo swapon /swapfile
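One caveat, general Linux practice rather than anything from the steps above: a swap file enabled with swapon is gone after a reboot. To keep it, an entry like this can be appended to /etc/fstab:

```
/swapfile none swap sw 0 0
```

After rebooting, `swapon --show` should list /swapfile again.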
from localgpt.
NVIDIA GeForce MX250, 16GB memory
from localgpt.
I am also getting a runtime error after running run_localGPT.py:
"RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 180355072 bytes."
I have an AMD Ryzen 9 5900X 12-core 3.7GHz CPU, an NVIDIA RTX 3070 GPU with 8GB VRAM, and 16GB RAM. Is more memory required to run localGPT?
I do have the repo saved to my external HD, whereas I noticed the LLM is saved on my C drive. Would this cause the surge in memory usage? If so, is it possible to store the models locally in the repo, similar to privateGPT?
Thanks in advance.
from localgpt.
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 16777216 bytes.
I keep getting the same response.
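A note on reading these errors (general PyTorch behavior, not specific to this thread): DefaultCPUAllocator means the failure is in system RAM, not GPU VRAM, so the drive the model is stored on shouldn't matter. The byte count is only the final allocation that failed, not the total the model needs:

```python
def to_mib(n_bytes: int) -> float:
    """Convert an allocator byte count to MiB."""
    return n_bytes / 2**20

# The two failing allocations quoted above are modest; RAM was already nearly full.
print(to_mib(180355072))  # -> 172.0
print(to_mib(16777216))   # -> 16.0
```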
from localgpt.
I have 4GB and I'm running out of memory. If we can get it to run on my old NVIDIA card, then it will run anywhere.
from localgpt.
What was the result?
from localgpt.
Yes, it's working now, very slow.
> Question:
summarize this document in one sentence
> Answer:
This is the Constitution of the United States, which outlines the basic structure and function of the government of the country.
from localgpt.