
minigpt4.cpp's Introduction

minigpt4.cpp

Quickstart in Colab

Inference of MiniGPT4 in pure C/C++.

Description

The main goal of minigpt4.cpp is to run MiniGPT4 with 4-bit quantization using the ggml library.
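To illustrate what 4-bit quantization means here, below is a minimal Python sketch of per-block quantization. This is a simplified illustration only; ggml's actual Q4 formats pack the 4-bit values into bytes and differ in block layout and scale encoding.

```python
# Simplified sketch of 4-bit block quantization (not ggml's exact Q4 format):
# weights are split into blocks, each stored as a per-block scale plus 4-bit integers.
def quantize_block(block):
    """Map a list of floats to (scale, 4-bit ints in 0..15)."""
    amax = max(abs(x) for x in block) or 1.0
    scale = amax / 7.0                       # signed 4-bit range is roughly [-8, 7]
    q = [max(-8, min(7, round(x / scale))) + 8 for x in block]  # bias into 0..15
    return scale, q

def dequantize_block(scale, q):
    """Recover approximate floats from the quantized representation."""
    return [(v - 8) * scale for v in q]

weights = [0.1, -0.52, 0.33, 0.9, -0.07, 0.0, 0.61, -0.88]
scale, q = quantize_block(weights)
restored = dequantize_block(scale, q)
print(max(abs(a - b) for a, b in zip(weights, restored)))  # small reconstruction error
```

The trade-off is the one the README's model choices reflect: 4-bit storage cuts memory roughly 4x versus f16 at the cost of a bounded per-weight rounding error.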

Demo

(demo animations omitted)

Usage

1. Clone repo

Requirements: git

git clone --recursive https://github.com/Maknee/minigpt4.cpp
cd minigpt4.cpp

2. Getting the library

Option 1: Download precompiled binary

Windows / Linux / MacOS

Go to Releases and extract minigpt4 library file into the repository directory.

Option 2: Build library manually

Windows

Requirements: CMake, Visual Studio and Git

cmake .
cmake --build . --config Release

bin\Release\minigpt4.dll should be generated

Linux

Requirements: CMake (Ubuntu: sudo apt install cmake)

cmake .
cmake --build . --config Release

minigpt4.so should be generated

MacOS

Requirements: CMake (MacOS: brew install cmake)

cmake .
cmake --build . --config Release

minigpt4.dylib should be generated

Note: To build with OpenCV (enabling features such as loading and preprocessing images within the library itself), set MINIGPT4_BUILD_WITH_OPENCV to ON in CMakeLists.txt, or pass -DMINIGPT4_BUILD_WITH_OPENCV=ON to the cmake CLI.
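The library filename differs per platform, as listed above. A small Python helper can pick the right one at runtime (a sketch; the filenames are taken from this README):

```python
import platform

def minigpt4_lib_name():
    """Return the platform-specific library filename listed in the README."""
    system = platform.system()
    if system == "Windows":
        return "minigpt4.dll"
    if system == "Darwin":
        return "minigpt4.dylib"
    return "minigpt4.so"  # Linux and other Unix-like systems

print(minigpt4_lib_name())
```

A wrapper script can use this to locate the library without hard-coding one platform's filename.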

3. Obtaining the model

Option 1: Download pre-quantized MiniGPT4 model

Pre-quantized models are available on Hugging Face: 7B or 13B.

Recommended for reliable results, but slow inference speed: minigpt4-13B-f16.bin

Option 2: Convert and quantize PyTorch model

Requirements: Python 3.x and PyTorch.

Clone the MiniGPT-4 repository and perform the setup

cd minigpt4
git clone https://github.com/Vision-CAIR/MiniGPT-4.git
cd MiniGPT-4
conda env create -f environment.yml
conda activate minigpt4

Download the pretrained checkpoint from the MiniGPT-4 repository (under "Checkpoint Aligned with Vicuna 7B" or "Checkpoint Aligned with Vicuna 13B"), or download it from the Hugging Face links for 7B or 13B.

Convert the model weights into ggml format

Windows

7B model

cd minigpt4
python convert.py C:\pretrained_minigpt4_7b.pth --ftype=f16

13B model

cd minigpt4
python convert.py C:\pretrained_minigpt4.pth --ftype=f16

Linux / MacOS

7B model

python convert.py ~/Downloads/pretrained_minigpt4_7b.pth --outtype f16

13B model

python convert.py ~/Downloads/pretrained_minigpt4.pth --outtype f16

minigpt4-7B-f16.bin or minigpt4-13B-f16.bin should be generated
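As a quick sanity check on a converted file, one can inspect its leading magic word. This sketch assumes the file begins with ggml's standard magic value (0x67676d6c, ASCII "ggml"); the exact header that minigpt4.cpp's convert.py writes may differ, so treat this as illustrative:

```python
import struct

# Assumption: the converted file starts with the standard ggml magic word.
GGML_MAGIC = 0x67676D6C  # ASCII "ggml"

def looks_like_ggml(path):
    """Return True if the first 4 bytes match the ggml magic word."""
    with open(path, "rb") as f:
        data = f.read(4)
    return len(data) == 4 and struct.unpack("<I", data)[0] == GGML_MAGIC

# Demonstrate with a fabricated header (stand-in for a real converted file):
with open("demo.bin", "wb") as f:
    f.write(struct.pack("<I", GGML_MAGIC))
print(looks_like_ggml("demo.bin"))  # True
```

A truncated download or a file saved in the wrong format would fail this check immediately, before a slow model load.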

4. Obtaining the vicuna model

Option 1: Download pre-quantized vicuna-v0 model

Pre-quantized models are available on Hugging Face

Recommended for reliable results and decent inference speed: ggml-vicuna-13B-v0-q5_k.bin

Option 2: Convert and quantize vicuna-v0 model

Requirements: Python 3.x and PyTorch.

Follow the guide from the MiniGPT-4 repository to obtain the vicuna-v0 model.

Then, clone llama.cpp

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake .
cmake --build . --config Release

Convert the model to ggml

python convert.py <path-to-model>

Quantize the model

./quantize <path-to-model> <output-model> Q4_1

5. Running

Test if minigpt4 works by calling the following, replacing minigpt4-13B-f16.bin and ggml-vicuna-13B-v0-q5_k.bin with your respective models

cd minigpt4
python minigpt4_library.py minigpt4-13B-f16.bin ggml-vicuna-13B-v0-q5_k.bin

Webui

Install the requirements for the webui

pip install -r requirements.txt

Then, run the webui, replacing minigpt4-13B-f16.bin and ggml-vicuna-13B-v0-q5_k.bin with your respective models

python webui.py minigpt4-13B-f16.bin ggml-vicuna-13B-v0-q5_k.bin

The output should contain something like the following:

Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.

Go to http://127.0.0.1:7860 in your browser and you should be able to interact with the webui.

minigpt4.cpp's People

Contributors

erjanmx, felladrin, hayasaka-ryosuke, maknee


minigpt4.cpp's Issues

CMake build fails for Ubuntu 20.04.6 LTS

Trying to build following the instructions of the README. I'm on Ubuntu 20.04.6 LTS.

I first cloned the repo:

git clone --recursive https://github.com/Maknee/minigpt4.cpp
cd minigpt4.cpp

I then installed cmake version 3.16.3-1ubuntu1.20.04.1 with
sudo apt install cmake.

Then I ran cmake ., but it fails:

cmake .
-- The C compiler identification is GNU 9.4.0
-- The CXX compiler identification is GNU 9.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Found Threads: TRUE  
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Working on fmt
-- Module support is disabled.
-- Version: 9.1.0
-- Build type: 
-- CXX_STANDARD: 23
-- Required features: cxx_variadic_templates
-- Working on unordered_dense
-- Working on stb
	Header only
-- Working on spdlog
-- Build spdlog: 1.11.0
-- Build type: Release
-- Working on nlohmann_json
-- Using the multi-header code from /home/marnix/ART/CAMERA/minigpt4.cpp/_deps/nlohmann_json-src/include/
-- Working on tl_expected
-- Working on llama_cpp
CMake Warning (dev) at _deps/llama_cpp-src/CMakeLists.txt:40 (option):
  Policy CMP0077 is not set: option() honors normal variables.  Run "cmake
  --help-policy CMP0077" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

  For compatibility with older versions of CMake, option is clearing the
  normal variable 'LLAMA_STATIC'.
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) at _deps/llama_cpp-src/CMakeLists.txt:41 (option):
  Policy CMP0077 is not set: option() honors normal variables.  Run "cmake
  --help-policy CMP0077" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

  For compatibility with older versions of CMake, option is clearing the
  normal variable 'LLAMA_NATIVE'.
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) at _deps/llama_cpp-src/CMakeLists.txt:42 (option):
  Policy CMP0077 is not set: option() honors normal variables.  Run "cmake
  --help-policy CMP0077" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

  For compatibility with older versions of CMake, option is clearing the
  normal variable 'LLAMA_LTO'.
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) at _deps/llama_cpp-src/CMakeLists.txt:55 (option):
  Policy CMP0077 is not set: option() honors normal variables.  Run "cmake
  --help-policy CMP0077" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

  For compatibility with older versions of CMake, option is clearing the
  normal variable 'LLAMA_AVX'.
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) at _deps/llama_cpp-src/CMakeLists.txt:56 (option):
  Policy CMP0077 is not set: option() honors normal variables.  Run "cmake
  --help-policy CMP0077" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

  For compatibility with older versions of CMake, option is clearing the
  normal variable 'LLAMA_AVX2'.
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) at _deps/llama_cpp-src/CMakeLists.txt:57 (option):
  Policy CMP0077 is not set: option() honors normal variables.  Run "cmake
  --help-policy CMP0077" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

  For compatibility with older versions of CMake, option is clearing the
  normal variable 'LLAMA_AVX512'.
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) at _deps/llama_cpp-src/CMakeLists.txt:60 (option):
  Policy CMP0077 is not set: option() honors normal variables.  Run "cmake
  --help-policy CMP0077" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

  For compatibility with older versions of CMake, option is clearing the
  normal variable 'LLAMA_FMA'.
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) at _deps/llama_cpp-src/CMakeLists.txt:67 (option):
  Policy CMP0077 is not set: option() honors normal variables.  Run "cmake
  --help-policy CMP0077" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

  For compatibility with older versions of CMake, option is clearing the
  normal variable 'LLAMA_ACCELERATE'.
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Found Git: /usr/bin/git (found version "2.25.1") 
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Working on magic_enum
-- Configuring done
CMake Error in CMakeLists.txt:
  The CXX_STANDARD property on target "minigpt4" contained an invalid value:
  "23".


CMake Generate step failed.  Build files cannot be regenerated correctly.

Any idea what is going on?

Is building from source necessary to get optimal performance for my system's architecture?

How to accelerate inference?

Hi,

I enabled the cublas compilation option.

The problem is that it does not load or process everything in GPU memory (VRAM).

What is the best command line to build and run each model as fast as possible on a CUDA 3090 with 24 GB of VRAM?

Generate a bin file in Linux.

In the "Convert the model to ggml" section of the README.md, what is the `<path-to-model>` parameter referring to? Following the instructions in MiniGPT4, I ended up with a folder containing the model weights. Which specific file should I point this parameter at?

two questions

Hello!

thank you for this.

is there any chance of getting a GUI that doesn't require installing Python, like koboldcpp, for portability?

also, can we use this with other models like WizardLM too?

kind regards

how to use the vicuna tokenizer

Hi,
I want to test the token speed of minigpt4, but loading the tokenizer failed.

AutoTokenizer.from_pretrained('maknee/ggml-vicuna-v0-quantized/13B') raises:

huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': 'maknee/ggml-vicuna-v0-quantized/13B'. Use repo_type argument if needed.

AutoTokenizer.from_pretrained('maknee/ggml-vicuna-v0-quantized') also fails:

Repository Not Found for url: https://huggingface.co/maknee/ggml-vicuna-v0-quantized/resolve/main/tokenizer_config.json.
Please make sure you specified the correct repo_id and repo_type.

What is the correct command for the tokenizer? Thanks
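The first error reflects the Hub's repo-id rule: an id is either 'repo_name' or 'namespace/repo_name', so a third path segment like '/13B' is rejected. An illustrative check (this regex is an approximation for demonstration, not the Hub's exact validator):

```python
import re

# Hugging Face repo ids are 'repo_name' or 'namespace/repo_name' - at most one slash.
# Illustrative approximation of the rule; the Hub's real validation is stricter.
REPO_ID_RE = re.compile(r"^[\w.\-]+(?:/[\w.\-]+)?$")

def looks_like_repo_id(s):
    return bool(REPO_ID_RE.match(s))

print(looks_like_repo_id("maknee/ggml-vicuna-v0-quantized"))      # True
print(looks_like_repo_id("maknee/ggml-vicuna-v0-quantized/13B"))  # False: extra path segment
```

If tokenizer files live in a 13B subdirectory of a valid repository, passing `subfolder="13B"` to `from_pretrained` is the usual way to address them, assuming those files actually exist in that repository.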

MiniGPT-v2 support

Posting to gauge/express interest in MiniGPT-v2 support being added.
