Comments (10)
In ingest.py, replace the embedding model with a smaller one:
# Create embeddings
# instructor-xl gives an out-of-memory error, so use a smaller model; if that still fails, use instructor-base
# A larger model can give better results, but it is useless if you can't load it
embeddings = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-large",
                                           model_kwargs={"device": device})
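If you're unsure which size fits your card, a small helper can encode the fallback chain described above. This is a hypothetical sketch (pick_instructor_model and its VRAM thresholds are rough guesses of mine, not part of localGPT):

```python
def pick_instructor_model(vram_gb: float) -> str:
    """Pick an instructor embedding model by available VRAM.

    Thresholds are rough assumptions, not measured requirements.
    """
    if vram_gb >= 12:
        return "hkunlp/instructor-xl"
    if vram_gb >= 6:
        return "hkunlp/instructor-large"
    return "hkunlp/instructor-base"

# e.g. on an 8GB card this falls back to instructor-large
print(pick_instructor_model(8))  # -> hkunlp/instructor-large
```

The returned name can then be passed straight to HuggingFaceInstructEmbeddings as model_name.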
from localgpt.
The same embedding call appears in the run_localGPT.py file, so replace it there too:
embeddings = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-large",
                                           model_kwargs={"device": device})
from localgpt.
Aaaand run_localGPT.py also downloads the 7B Vicuna model TheBloke/vicuna-7B-1.1-HF, which is too large as well, so we'll probably need a smaller model for that one too to prevent CUDA out-of-memory errors. I've tried changing it to cerebras/Cerebras-GPT-2.7B, which I know fits on my 3070 card, but that didn't work. Maybe someone else has time to find a smaller model that does work with this script.
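For context on why a 7B model overwhelms an 8GB card: the weights alone, before activations and KV cache, already exceed the VRAM. A rough back-of-envelope (the helper below is illustrative, not part of localGPT):

```python
def weights_gib(n_params: float, bytes_per_param: int) -> float:
    """Approximate weight-only memory footprint in GiB."""
    return n_params * bytes_per_param / 2**30

# Vicuna-7B: roughly 7e9 parameters
print(round(weights_gib(7e9, 2), 1))  # fp16: ~13.0 GiB, already more than 8GB VRAM
print(round(weights_gib(7e9, 4), 1))  # fp32: ~26.1 GiB
```

By this estimate, even a 2.7B model in fp32 is around 10 GiB of weights, which may be why the Cerebras swap alone wasn't enough.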
from localgpt.
OK, if you run this on Linux, try these steps. It looks like the Constitution needs 66GB by itself. FYI, this is very slow...
# Turn off all swap processes
sudo swapoff -a
# Resize the swap file (from 512 MB to 100GB)
sudo dd if=/dev/zero of=/swapfile bs=1G count=100
# Restrict permissions to owner read/write, as mkswap expects
sudo chmod 600 /swapfile
# Make the file usable as swap
sudo mkswap /swapfile
# Activate the swap file
sudo swapon /swapfile
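One caveat, general Linux practice rather than anything from the steps above: a swap file enabled with swapon is gone after a reboot. To keep it, an entry like this can be appended to /etc/fstab:

```
/swapfile none swap sw 0 0
```

After rebooting, `swapon --show` should list /swapfile again.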
from localgpt.
NVIDIA GeForce MX250, 16GB memory
from localgpt.
I am also getting a runtime error after running run_localGPT.py:
"RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 180355072 bytes."
I have an AMD Ryzen 9 5900X 12-core 3.7GHz CPU, an NVIDIA RTX 3070 GPU with 8GB VRAM, and 16GB RAM. Is more memory required to run localGPT?
I do have the repo saved to my external HD, whereas I noticed the LLM is saved on my C drive. Would this cause the surge in memory usage? If so, is it possible to store the models locally in the repo, similar to privateGPT?
Thanks in advance.
from localgpt.
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 16777216 bytes.
I keep getting the same response.
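A note on reading these errors (general PyTorch behavior, not specific to this thread): DefaultCPUAllocator means the failure is in system RAM, not GPU VRAM, so the drive the model is stored on shouldn't matter. The byte count is only the final allocation that failed, not the total the model needs:

```python
def to_mib(n_bytes: int) -> float:
    """Convert an allocator byte count to MiB."""
    return n_bytes / 2**20

# The two failing allocations quoted above are modest; RAM was already nearly full.
print(to_mib(180355072))  # -> 172.0
print(to_mib(16777216))   # -> 16.0
```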
from localgpt.
I have 4GB and I'm running out of memory. If we can get it to run on my old NVIDIA card, then it will run anywhere.
from localgpt.
What was the result?
from localgpt.
Yes, it's working now, very slow.
> Question:
summarize this document in one sentence
> Answer:
This is the Constitution of the United States, which outlines the basic structure and function of the government of the country.
from localgpt.