
Comments (10)

benninkcorien commented on June 10, 2024

In ingest.py, replace the model with a smaller one:

# Create embeddings
# instructor-xl gives an out-of-memory error, so use a smaller model; if that still fails, use instructor-base
# A larger model can give better results, but it is 'useless' if you can't load it

embeddings = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-large",
                                           model_kwargs={"device": device})
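
If instructor-large still runs out of memory, here is a minimal sketch of the fallback described above (the try/except wrapper and the device detection are illustrative, not part of ingest.py):

    import torch
    from langchain.embeddings import HuggingFaceInstructEmbeddings

    device = "cuda" if torch.cuda.is_available() else "cpu"

    try:
        # Try the larger model first; it usually gives better retrieval quality.
        embeddings = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-large",
                                                   model_kwargs={"device": device})
    except RuntimeError:
        # CUDA out-of-memory errors during loading surface as RuntimeError;
        # fall back to the smaller instructor-base model.
        embeddings = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-base",
                                                   model_kwargs={"device": device})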

benninkcorien commented on June 10, 2024

It's also used in the run_localGPT.py file, so replace it there too:

    embeddings = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-large",
                                               model_kwargs={"device": device})

benninkcorien commented on June 10, 2024

Aaaand run_localGPT.py also downloads the 7B Vicuna model TheBloke/vicuna-7B-1.1-HF, which is too large as well, so we'll probably need a smaller model for that one too to prevent CUDA memory errors.

I've tried changing it to cerebras/Cerebras-GPT-2.7B, which I know fits on my 3070, but that didn't work. Maybe someone else has time to find a smaller model that does work with this script.
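
For reference, a rough, hedged sketch of what swapping in a different causal LM from the Hugging Face hub could look like (the Auto* classes, half-precision loading, and the generation settings are generic transformers usage, not the exact code in run_localGPT.py; the model id is a placeholder to replace with a smaller one):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

    model_id = "TheBloke/vicuna-7B-1.1-HF"  # swap in a smaller model id here
    device = "cuda" if torch.cuda.is_available() else "cpu"

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,   # half precision roughly halves VRAM use
        low_cpu_mem_usage=True,      # avoid a full extra copy in RAM while loading
    )
    generator = pipeline(
        "text-generation",
        model=model,
        tokenizer=tokenizer,
        device=0 if device == "cuda" else -1,
        max_new_tokens=256,
    )

As the Cerebras attempt above shows, swapping the id alone is not guaranteed to work; the prompt format and tokenizer the script expects matter too.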

NQevxvEtg commented on June 10, 2024

OK, if you run this on Linux, try these steps. It looks like the Constitution alone needs 66 GB. FYI, this is very slow...

# Turn off all swap processes
sudo swapoff -a

# Create a 100 GB swap file (the default here was 512 MB)
sudo dd if=/dev/zero of=/swapfile bs=1G count=100

# Restrict permissions and format the file as swap
sudo chmod 600 /swapfile
sudo mkswap /swapfile

# Activate the swap file
sudo swapon /swapfile

georgeqin96 commented on June 10, 2024

NVIDIA GeForce MX250, 16 GB memory

fjsikora commented on June 10, 2024

I am also getting a runtime error after running run_localGPT.py:

"RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 180355072 bytes."

I have an AMD Ryzen 9 5900X 12-core 3.7 GHz CPU, an NVIDIA RTX 3070 GPU with 8 GB VRAM, and 16 GB RAM. Is more memory required to run localGPT?

I do have the repo saved to my external HD, whereas I noticed the LLM is saved on my C drive. Would this cause the surge in memory usage? If so, is it possible to store the models locally in the repo, similar to privateGPT?

Thanks in advance.
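
On the storage question: where the weights sit on disk should not by itself change how much RAM loading them needs, but the model caches can be redirected into the repo. A minimal sketch, assuming the environment variables are set before transformers / InstructorEmbedding are imported (the ./models path is just an example):

    import os

    # Redirect the Hugging Face and sentence-transformers caches into the repo
    # (here ./models) so downloaded weights live next to the code instead of on C:.
    os.environ["HF_HOME"] = os.path.join(os.getcwd(), "models")
    os.environ["SENTENCE_TRANSFORMERS_HOME"] = os.path.join(os.getcwd(), "models")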

Hermit07 commented on June 10, 2024

RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 16777216 bytes.

I keep getting the same error.

msoler75 commented on June 10, 2024

I have 4 GB and I'm running out of memory. If we can get it to run on my old NVIDIA card, it will run anywhere.

zoomspoon1 commented on June 10, 2024

Did it work?

NQevxvEtg commented on June 10, 2024

Yes, it's working now, but it's very slow.

> Question:
summarize this document in one sentense

> Answer:
 This is the Constitution of the United States, which outlines the basic structure and function of the government of the country.
