Comments (9)

juncongmoo commented on August 28, 2024

Hello, I am using an Nvidia GTX 1650 4GB GPU. Is there any way to run the 7B model on it?

Sure. I am still working on it... so that we can run 13B on a 4GB GPU.

from pyllama.

SWHL commented on August 28, 2024

I changed max_seq_len from 1024 to 512 and was able to run inference successfully with 16GB of RAM.

max_seq_len=1024,
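A rough way to see why halving max_seq_len helps: LLaMA-style code typically pre-allocates a key/value cache whose size grows linearly with max_seq_len. The sketch below is a back-of-the-envelope estimate only, not code from pyllama. The function name `kv_cache_bytes` is hypothetical, and the defaults (32 layers, hidden size 4096, 16-bit values, batch size 1) are assumptions based on the published 7B configuration.

```python
def kv_cache_bytes(max_seq_len, max_batch_size=1, n_layers=32, dim=4096,
                   bytes_per_value=2):
    """Estimate the pre-allocated key/value cache size in bytes.

    Assumed defaults: 7B-sized model (32 layers, hidden size 4096),
    16-bit cache entries, batch size 1. One key tensor and one value
    tensor are kept per layer, hence the leading factor of 2.
    """
    return 2 * n_layers * max_batch_size * max_seq_len * dim * bytes_per_value


if __name__ == "__main__":
    for seq_len in (1024, 512):
        gib = kv_cache_bytes(seq_len) / 2**30
        print(f"max_seq_len={seq_len}: about {gib:.2f} GiB of KV cache")
```

Under these assumptions the cache is halved when max_seq_len is halved, and it scales with the batch size as well, so a large max_batch_size makes the saving much bigger. The model weights themselves are unaffected by this setting.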

SWHL commented on August 28, 2024
  • I cleaned up the code from the repo and kept only the two simplest use cases.
  • I hope it helps everyone.
  • LLaMADemo

Starlento commented on August 28, 2024

For me, inference.py uses 15.7GB of VRAM with everything left at its defaults. I am on Windows WSL2, though, so the script's own VRAM usage should be roughly 14GB. That still seems to be too much for 12GB of VRAM.

m-GDEV commented on August 28, 2024

Oh OK, yeah, that would make sense. Do you think there might be a way to run the script so that it uses less VRAM?

brandonrobertz commented on August 28, 2024

You can try the 8-bit (lower-precision) model: https://github.com/tloen/llama-int8

Starlento commented on August 28, 2024

I found https://github.com/qwopqwop200/GPTQ-for-LLaMa, which quantizes the model to 4-bit, and I can run the benchmark in that repo. However, the code there uses Hugging Face Transformers, and at least the model loading seems to be different.
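To put the 8-bit and 4-bit options mentioned in this thread in perspective, here is a rough estimate of how much memory the weights alone need at each precision. This is only a sketch: the helper `weight_bytes` is hypothetical, the parameter count is taken as a nominal 7 billion, and it ignores the KV cache, activations, and any per-group quantization metadata, so real usage will be higher.

```python
def weight_bytes(n_params, bits_per_weight):
    """Approximate storage for the weights only, in bytes.

    Ignores quantization scales/zero points, activations, and the
    KV cache, so this is a lower bound rather than a measurement.
    """
    return n_params * bits_per_weight // 8


if __name__ == "__main__":
    n_params = 7_000_000_000  # assumption: nominal size of the "7B" model
    for bits in (16, 8, 4):
        gib = weight_bytes(n_params, bits) / 2**30
        print(f"{bits}-bit weights: about {gib:.1f} GiB")
```

By this estimate, 16-bit weights are in line with the roughly 14GB reported above, 8-bit weights would fit in 12GB of VRAM with some headroom, and 4-bit weights get close to the 4GB range, although overhead on top of the weights may still push a 4GB card over its limit.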

vo2021 commented on August 28, 2024

I changed max_seq_len from 1024 to 512 and was able to run inference successfully with 16GB of RAM.

max_seq_len=1024,

Thanks for the tips!

DigitalLawyerLCB commented on August 28, 2024

Hello, I am using an Nvidia GTX 1650 4GB GPU. Is there any way to run the 7B model on it?
