nomic-ai / gpt4all
gpt4all: run open-source LLMs anywhere
Home Page: https://gpt4all.io
License: MIT License
I used ./gpt4all-lora-quantized-OSX-m1
> Give me a C++ algorithm for the integer square root (isqrt).
The following is an implementation of iSqrt() function which returns the integer Square Root of its argument, in terms of unsigned long ints on 32-bit systems. It uses Newton's method to find a better approximation as it iterates through each iteration until convergence or exceeding maximum number of iterations allowed (10).
```cpp
#include <iostream>
using namespace std; // for cout << endl;
unsigned long int iSqrt( unsigned long int n ) {
if (n <= 2) return n *= 3/2 + sqrt((double)(n*n)); else{
double x = pow(-1.0, floor(.5*(log(fabs(n)))+1./4))/(sqrt(6)*pow(x,3)/pi); // Newton's method for integer square root calculation;
unsigned long int result= (unsigned)floor((double)(x)); // convert double to signed integer.
if ((result-fabs(n/2+1))) { x = pow(-1.0,( floor(.5*(log(fabs(x)))) + .75))/(sqrt(6)*pow(x,3)/pi); } // Newton's method for double square root calculation;
if (result-floor((double)(n/2+1))) { x = pow(-1.0,( floor(.5*(log(fabs(x)))) + .75))/(sqrt(6)*pow(x,3)/pi); } // Newton's method for double square root calculation;
if (result-floor((double)(n/2+1))) { x = pow(-1.0,( floor(.5*(log(fabs(x)))) + .75))/(sqrt(6)*pow(x,3)/pi); } // Newton's method for double square root calculation;
```
... keeps repeating the last line until stopped ...
Another session:
> Give me a C++ algorithm for the integer square root (isqrt).
Here's one way to implement iSqrt in C++, using an array of integers and dynamic programming techniques:
```cpp
int sqr(const int& num) { // returns squared value, not just a number
constexpr auto MAX = (std::numeric_limits<decltype>::max)*2;// max possible square root for any integer <= this limit
if (num < 0 || std::abs(int64_t{1}) == num) { // handle special cases like -ve numbers and 1 itself
return sqr(-std::fmod<decltype>(-std::floor, int64_t{2}, MAX));//return floor of the square root for negative number or just take it as is if positive.
} else { // handle normal cases like nonnegative numbers > 1 (iSqrt)
constexpr auto sqr = std::pow(std::fabs<decltype>(num),2);//return squared value of the input integer number for isqrt calculation
if ((int64_t{0} == num && int64_t{1}) || (sqr <= MAX)) { //handle special cases like 0 and sqrt(1) which are both irrational numbers.
return std::fmod<decltype>(-std::floor, sqr);//return floor of the square root for these two values as they have no rational value to represent them in iSqrt calculation.
} else { //handle normal cases like nonnegative integers > 1 (iSquare)
constexpr auto num2 = std::fabs<decltype>(num);//return absolute value of the input integer number for sqr(x).
if ((sqr <= MAX && int64_t{0} == num || (int64_t{1}) > 3) //handle special cases like sqrt(2), which is irrational and has no rational value to represent it in iSqrt calculation.
return std::fmod<decltype>(-std::floor, int64_t{num/int64_t{1}});//return floor of the square root for these two values as they have irrational numbers and no rational value to represent them in iSqrt calculation.
else { //handle normal cases like nonnegative integers > 2 (iSquare)
constexpr auto num3 = std::fabs<decltype>(num);//return absolute value of the input integer number for sqr(x).
if ((sqr <= MAX && int64_t{0} == num || (int64_t{1}) > 2) //handle special cases like sqrt(3), which is irrational and has no rational value to represent it in iSqrt calculation.
return std::fmod<decltype>(-std::floor, int64_t{num/int64_t{1}});//return floor of the square root for these two values as they have irrational numbers and no rational value to represent them in iSqrt calculation.
else { //handle normal cases like nonnegative integers > 3 (iSquare)
constexpr auto num4 = std::fabs<decltype>(num);//return absolute value of the input integer number for sqr(x).
```
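For reference, the algorithm both sessions were asked for is only a few lines. A minimal sketch of the integer Newton iteration for floor(sqrt(n)), written here in Python rather than the requested C++, but a direct port:

```python
def isqrt(n: int) -> int:
    """Integer square root: the largest x with x*x <= n, via Newton's method."""
    if n < 0:
        raise ValueError("isqrt is undefined for negative inputs")
    if n < 2:
        return n
    x = n
    y = (x + n // x) // 2
    while y < x:                      # the estimate decreases monotonically
        x, y = y, (y + n // y) // 2   # Newton step with integer division
    return x

assert isqrt(0) == 0 and isqrt(15) == 3 and isqrt(16) == 4
```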
Hi y'all, thank you for releasing this work!
I was trying to re-run the training using the base finetuning.yml
(+1 on #30 btw), but got the error below. Did I skip a step or do something wrong?
  ...
    self.state = AcceleratorState(
  File "/Users/ericnograles/Library/Python/3.9/lib/python/site-packages/accelerate/state.py", line 117, in __init__
    PartialState(cpu, **kwargs)
  ...
    torch.distributed.init_process_group(backend="nccl", **kwargs)
  File "/Users/ericnograles/Library/Python/3.9/lib/python/site-packages/torch/distributed/distributed_c10d.py", line 895, in init_process_group
    raise RuntimeError("Distributed package doesn't have NCCL " "built in")
RuntimeError: Distributed package doesn't have NCCL built in
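NCCL only exists in CUDA builds of PyTorch, so it can't be used on a Mac. If the goal is just to get the distributed init past this point on CPU, a minimal sketch (my workaround, not a fix from this repo; assumes a single-process run) is to fall back to the gloo backend:

```python
import torch.distributed as dist

# NCCL requires an NVIDIA/CUDA build of PyTorch; on macOS or any
# CPU-only machine, "gloo" is the usual fallback backend.
dist.init_process_group(
    backend="gloo",
    init_method="tcp://127.0.0.1:29500",  # any free local port
    rank=0,
    world_size=1,
)
```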
And when I try to install deepspeed it throws:
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [15 lines of output]
test.c
LINK : fatal error LNK1181: cannot open input file 'aio.lib'
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "C:\Users\fer\AppData\Local\Temp\pip-install-rl2m083a\deepspeed_0a392c06e34c405d970b33e0372ff5a3\setup.py", line 156, in <module>
abort(f"Unable to pre-compile {op_name}")
File "C:\Users\fer\AppData\Local\Temp\pip-install-rl2m083a\deepspeed_0a392c06e34c405d970b33e0372ff5a3\setup.py", line 48, in abort
assert False, msg
AssertionError: Unable to pre-compile async_io
DS_BUILD_OPS=1
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] One can disable async_io with DS_BUILD_AIO=0
[ERROR] Unable to pre-compile async_io
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
This is happening on Windows 11.
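As the log itself suggests, async_io needs libaio, which isn't available on Windows, so tell the build to skip that op before installing (cmd.exe syntax; untested here, based on the warning in the output above):
set DS_BUILD_AIO=0
pip install deepspeed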
I have a dataset I want to train or fine-tune on. How can I do this?
Do you also have an executable for Mac/Intel chips?
The training dataset (gpt4all_curated_data_without_p3_2022_03_27.tar.gz) has an issue with an empty prompt
raise TypeError("len() of a 0-d tensor") TypeError: len() of a 0-d tensor
It's a minor problem, but you can't train with it because it crashes. A simple fix is to trim it out:
egrep -v 'prompt": ""' data.jsonl > data_trimmed.jsonl
I'd submit a pull request but it's not data in the repo.
The executable appears to require the model to be in the current directory:
(venv) ➜ gpt4all git:(main) ✗ ./chat/gpt4all-lora-quantized-OSX-m1
main: seed = 1680048739
llama_model_load: loading model from 'gpt4all-lora-quantized.bin' - please wait ...
llama_model_load: failed to open 'gpt4all-lora-quantized.bin'
main: failed to load model from 'gpt4all-lora-quantized.bin'
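A workaround, assuming the .bin lives next to the executable: run it from that directory so the relative path resolves, e.g.
cd chat && ./gpt4all-lora-quantized-OSX-m1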
make it very easy for people to get up and running with the model on gpu
How many GPUs?
Thank you very much!
model_name: # REPLACE HERE with the base llama model
tokenizer_name: # REPLACE HERE with the llama tokenizer
lora: true
lora_path: "/nomic-ai/gpt4all-lora"
max_new_tokens: 512
temperature: 0
prompt: null
╭─────────────────────────── Traceback (most recent call last) ────────────────────────────╮
│ /Users/michel/micromamba/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py:23 │
│ 9 in hf_raise_for_status │
│ │
│ 236 │ │
│ 237 │ """ │
│ 238 │ try: │
│ ❱ 239 │ │ response.raise_for_status() │
│ 240 │ except HTTPError as e: │
│ 241 │ │ error_code = response.headers.get("X-Error-Code") │
│ 242 │
│ │
│ /Users/michel/micromamba/lib/python3.9/site-packages/requests/models.py:1021 in │
│ raise_for_status │
│ │
│ 1018 │ │ │ ) │
│ 1019 │ │ │
│ 1020 │ │ if http_error_msg: │
│ ❱ 1021 │ │ │ raise HTTPError(http_error_msg, response=self) │
│ 1022 │ │
│ 1023 │ def close(self): │
│ 1024 │ │ """Releases the connection back to the pool. Once this method has been │
╰──────────────────────────────────────────────────────────────────────────────────────────╯
HTTPError: 404 Client Error: Not Found for url:
https://huggingface.co/gpt4all-lora/resolve/main/config.json
By the way, gpt4all-lora-quantized.bin is perfectly working using ./gpt4all-lora-quantized-OSX-m1
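Note the failing URL: it asks for gpt4all-lora without the nomic-ai/ namespace, which suggests the leading slash in lora_path keeps the hub from resolving the full repo id, and that model_name/tokenizer_name were still blank placeholders. A hedged correction (the model_name value is a hypothetical example; substitute whatever base LLaMA checkpoint you actually use):
model_name: "decapoda-research/llama-7b-hf" # hypothetical example base model
tokenizer_name: "decapoda-research/llama-7b-hf"
lora: true
lora_path: "nomic-ai/gpt4all-lora"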
I followed the README, downloaded the bin file, copied it into the chat folder, and ran ./gpt4all-lora-quantized-linux-x86.
Then I started asking questions. After a few questions, I asked for a joke, and it has been stuck in a loop repeating the same lines over and over (maybe that's the joke! it's making fun of me!).
I can share the seed and the exact questions that I asked if that would help.
I want to determine how costly it is to fine-tune the 7B model for local execution; my GPU trains models ~10x slower than one A100. My initial estimates come from TRAINING_LOG.md and Meta's research paper.
On Windows 11, use the prompt:
Create a detailed step by step plan for a kichen robot to make toast. Each step should be an individual action in machine readable format. The available actions: Grab(object), Move(object, next_to), Drop(object), Push(button), Cut(object, number_of_pieces).
Debugger output:
(5620.c8c): Security check failure or stack buffer overrun - code c0000409 (!!! second chance !!!)
Subcode: 0x8 FAST_FAIL_RANGE_CHECK_FAILURE
gpt4all_lora_quantized_win64+0x23361:
00007ff6`7dfe3361 cd29 int 29h
0:000> k
# Child-SP RetAddr Call Site
00 00000048`2b2fe2d0 00007ff6`7dfe333e gpt4all_lora_quantized_win64+0x23361
01 00000048`2b2fe300 00007ff6`7dfcb760 gpt4all_lora_quantized_win64+0x2333e
02 00000048`2b2fe330 00007ff6`7dfe3adc gpt4all_lora_quantized_win64+0xb760
03 00000048`2b2ffb30 00007ff8`0f923db1 gpt4all_lora_quantized_win64+0x23adc
04 00000048`2b2ffb70 00007ff8`101f32a1 KERNEL32!BaseThreadInitThunk+0x21
05 00000048`2b2ffba0 00000000`00000000 ntdll!RtlUserThreadStart+0x21
I tried to rerun the model (it worked fine the first time)
and I got this error:
main: seed = ****76542
llama_model_load: loading model from 'gpt4all-lora-quantized.bin' - please wait ...
llama_model_load: failed to open 'gpt4all-lora-quantized.bin'
main: failed to load model from 'gpt4all-lora-quantized.bin'
Hi, just wanted to say that the quantised model in the README is served from S3: a 4GB download at 9¢ per GB egress means that you (or Nomic) are getting charged $0.36 per download. At 1,000 downloads, that's around $360 USD.
Perhaps it might be worthwhile to look at alternative hosting like Cloudflare R2 (free egress), or a mirror link / torrents (free).
https://s3.amazonaws.com/static.nomic.ai/gpt4all/models/gpt4all-lora-quantized.bin
As a JS developer I'd like to use this project in my personal open-source library. But unfortunately I don't speak Python and have no idea how to use or change it. Is there any chance the project will provide a WebSocket or at least a REST API?
I looked around but didn't see it: where is the source for the chat interface?
$ git clone --recurse-submodules git@github.com:nomic-ai/gpt4all.git
...
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
But this worked OK:
$ git clone --recurse-submodules https://github.com/nomic-ai/gpt4all.git
Cloning into 'gpt4all'...
remote: Enumerating objects: 315, done.
...
Submodule path 'transformers': checked out 'cae78c46d658a8e496a815c2ee49b9b178fb9c9a'
write x86 assembler code compilable in MASM to make an "hello world" program
The response is the following assembly language source file that can be compiled using MASM (Microsoft Macro Assembler) and executed on a 32-bit Windows system. This example shows how to print out “Hello World” in Assembly Language, which demonstrates basic programming concepts such as input/output operations:
```asm
.model flat, prop=0x18c
section = "hello" ; Section name for the code segment
flags = secsig32 ; Flags that indicate this is a 64-bit executable (segsize/segvsize)
| segalign, nopsz=1024 // Segment alignment and size in bytes.
section = "data" ; Section name for the data segment
flags = secsig32 ; Flags that indicate this is a 64-bit executable (segsize/segvsize)
| segalign, nopsz=1024 // Segment alignment and size in bytes.
section = "bss" ; Section name for the bss segment
flags = secsig32 ; Flags that indicate this is a 64-bit executable (segsize/segvsize)
| segalign, nopsz=1024 // Segment alignment and size in bytes.
section = "heap" ; Section name for the heap segment
flags = secsig32 ; Flags that indicate this is a 64-bit executable (segsize/segvsize)
| segalign, nopsz=1024 // Segment alignment and size in bytes.
section = "stack" ; Section name for the stack segment
flags = secsig32 ; Flags that indicate this is a 64-bit executable (segsize/segvsize)
| segalign, nopsz=1024 // Segment alignment and size in bytes.
section = "text" ; Section name for the code segment
flags = secsig32 ; Flags that indicate this is a 64-bit executable (segsize/segvsize)
| segalign, nopsz=1024 // Segment alignment and size in bytes.
```
... then repeats the data/bss/heap sections again until interrupted with ^C ...
I tested controversial topics (for example, instructions for murder or torture), and gpt4all-lora-quantized.bin seems to mimic much of ChatGPT's excuses for not answering such questions, while ggml-alpaca-7b-native-q4.bin answers happily and on-topic.
Also consider increasing the scanf buffer: antimatter15/alpaca.cpp#119
user@gpt4:~/gpt4all/chat$ ./gpt4all-lora-quantized-linux-x86
main: seed = 1680120667
llama_model_load: loading model from 'gpt4all-lora-quantized.bin' - please wait ...
Illegal instruction (core dumped)
dmesg shows:
[ 104.211520] systemd[1]: systemd 249.11-0ubuntu3.7 running in system mode (+PAM +AUDIT +SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY -P11KIT -QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified)
[ 104.211578] systemd[1]: Detected virtualization kvm.
[ 104.211582] systemd[1]: Detected architecture x86-64.
[ 5620.273116] show_signal: 22 callbacks suppressed
[ 5620.273119] traps: gpt4all-lora-qu[17654] trap invalid opcode ip:423d62 sp:7ffe451f4828 error:0 in gpt4all-lora-quantized-linux-x86[400000+55000]
[ 5647.501626] traps: gpt4all-lora-qu[17668] trap invalid opcode ip:423d62 sp:7fffdfc29678 error:0 in gpt4all-lora-quantized-linux-x86[400000+55000]
strace tail shows:
...
loading libs, reading gpt4all-lora-quantized.bin
...
brk(0x13d5000) = 0x13d5000
brk(0x13f6000) = 0x13f6000
read(3, "\0\0\340\245\244\2\0\0\0\321\220\3\0\0\0\341\276\266\3\0\0\0\342\236\226\3\0\0\0\345\272\247"..., 8191) = 8191
--- SIGILL {si_signo=SIGILL, si_code=ILL_ILLOPN, si_addr=0x423d62} ---
+++ killed by SIGILL (core dumped) +++
Illegal instruction (core dumped)
ILL_ILLOPN = Illegal operand. I suppose some CPU instruction is not available.
The CPU is AMD Epyc 7313, running Ubuntu 22.04 inside of a VM.
From the VM, the following cpu flags are enabled:
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm rep_good nopl cpuid extd_apicid tsc_known_freq pni
cx16 x2apic hypervisor cmp_legacy 3dnowprefetch vmmcall
Virtualization features:
Hypervisor vendor: KVM
Virtualization type: full
Unfortunately I'm not very experienced with VMs; however, I would like to run GPT chat on a server.
Is it possible to get the source of gpt4all-lora-quantized-linux-x86 to recompile?
git submodule configure
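A hedged reading of the logs above: the flags list shows sse/sse2 but no avx, avx2, fma, or f16c, and llama.cpp-style binaries are typically built with those enabled, which would explain the invalid-opcode trap at a fixed address. Two options, assuming the prebuilt binary is a stock llama.cpp build: expose the host CPU to the guest (for KVM, set the CPU model to host-passthrough so the Epyc's AVX2 is visible in the VM), or rebuild from llama.cpp source on the target machine so -march=native only emits instructions the CPU actually reports.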
First, it does not seem to want to load the .bin model from the link,
so I thought okay, let's at least see if the generate function works.
No dice there either:
it's asking for a Hugging Face login, which was not part of the instructions. Eager to see how this compares to base LLaMA, which is quite fun to play with and fine-tune.
I can't seem to find a LLaMA model on HF that works with generate.py. The closest I got was this error:
Setting up model
Traceback (most recent call last):
File "/home/ubuntu/gpt4all/generate.py", line 52, in <module>
model, tokenizer = setup_model(config)
File "/home/ubuntu/gpt4all/generate.py", line 20, in setup_model
model = AutoModelForCausalLM.from_pretrained(config["model_name"], device_map="auto", torch_dtype=torch.float16)
File "/home/ubuntu/.local/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 441, in from_pretrained
config, kwargs = AutoConfig.from_pretrained(
File "/home/ubuntu/.local/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 917, in from_pretrained
config_class = CONFIG_MAPPING[config_dict["model_type"]]
File "/home/ubuntu/.local/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 623, in __getitem__
raise KeyError(key)
KeyError: 'llama'
Any tips or tricks? What does the config.json need to look like for this to work?
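KeyError: 'llama' usually means the installed transformers predates LLaMA support, so it doesn't matter what the config.json looks like: "model_type": "llama" simply isn't in its registry yet. Two hedged fixes, assuming you cloned with submodules: install the transformers submodule the repo vendors (something like pip install -e ./transformers from the repo root), or upgrade to a transformers release that ships LLaMA support (4.28+).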
I know it has been covered elsewhere, but what people need to understand is that you can use your own data, but you need to train on it.
So I suggest adding a short guide, as simple as possible.
There are thousands and thousands of people waiting for this.
I got the following error on Windows 11 when running this command:
F:\>git clone --recurse-submodules git@github.com:nomic-ai/gpt4all.git
Cloning into 'gpt4all'...
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
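Same fix as the identical error above: clone over HTTPS if you don't have an SSH key registered with GitHub:
git clone --recurse-submodules https://github.com/nomic-ai/gpt4all.git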
Hi there,
I'm trying to run this on a Mac mini i7 (4-core/16GB) running Ubuntu. (I know this is going to perform really badly!)
When I run it, I get "Illegal instruction (core dumped)".
I assume this is because this old CPU is missing something, perhaps AVX?
Does anyone know what I would change in the makefile for gpt4all.cpp to get around this?
Thanks!
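Likely, yes: pre-2011 Intel chips have no AVX. If you build from llama.cpp source rather than using the prebuilt binary, the instruction-set flags live in the Makefile's CFLAGS (llama.cpp convention, assumed to carry over here); removing -mavx -mavx2 -mfma -mf16c, or compiling with -march=native on the old machine itself, should give a binary that runs, just more slowly.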
Year should probably be 2023:
"We collected roughly one million prompt response pairs using the GPT-3.5-Turbo OpenAI API between March 20, 2022 and March 26th, 2022"
Thanks for putting this repo together!
I am encountering an issue when running this command: python generate.py --config configs/generate/generate.yaml --prompt "Write a script to reverse a string in Python"
Traceback (most recent call last):
File "/home/Owner/Developer/gpt4all/generate.py", line 2, in <module>
from peft import PeftModelForCausalLM
File "/home/Owner/Developer/gpt4all/peft/src/peft/__init__.py", line 22, in <module>
from .mapping import MODEL_TYPE_TO_PEFT_MODEL_MAPPING, PEFT_TYPE_TO_CONFIG_MAPPING, get_peft_config, get_peft_model
File "/home/Owner/Developer/gpt4all/peft/src/peft/mapping.py", line 16, in <module>
from .peft_model import (
File "/home/Owner/Developer/gpt4all/peft/src/peft/peft_model.py", line 22, in <module>
from accelerate import dispatch_model, infer_auto_device_map
File "/home/Owner/anaconda3/lib/python3.9/site-packages/accelerate/__init__.py", line 7, in <module>
from .accelerator import Accelerator
File "/home/Owner/anaconda3/lib/python3.9/site-packages/accelerate/accelerator.py", line 27, in <module>
from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
File "/home/Owner/anaconda3/lib/python3.9/site-packages/accelerate/checkpointing.py", line 24, in <module>
from .utils import (
File "/home/Owner/anaconda3/lib/python3.9/site-packages/accelerate/utils/__init__.py", line 96, in <module>
from .other import (
File "/home/Owner/anaconda3/lib/python3.9/site-packages/accelerate/utils/other.py", line 29, in <module>
from deepspeed import DeepSpeedEngine
File "/home/Owner/anaconda3/lib/python3.9/site-packages/deepspeed/__init__.py", line 14, in <module>
from . import module_inject
File "/home/Owner/anaconda3/lib/python3.9/site-packages/deepspeed/module_inject/__init__.py", line 1, in <module>
from .replace_module import replace_transformer_layer, revert_transformer_layer, ReplaceWithTensorSlicing
File "/home/Owner/anaconda3/lib/python3.9/site-packages/deepspeed/module_inject/replace_module.py", line 15, in <module>
from ..runtime.zero import GatheredParameters
File "/home/Owner/anaconda3/lib/python3.9/site-packages/deepspeed/runtime/zero/__init__.py", line 6, in <module>
from .partition_parameters import ZeroParamType
File "/home/Owner/anaconda3/lib/python3.9/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 22, in <module>
from .linear import LinearModuleForZeroStage3, zero3_linear_wrap
File "/home/Owner/anaconda3/lib/python3.9/site-packages/deepspeed/runtime/zero/linear.py", line 20, in <module>
from deepspeed.runtime.utils import noop_decorator
File "/home/Owner/anaconda3/lib/python3.9/site-packages/deepspeed/runtime/utils.py", line 19, in <module>
from torch._six import inf
ModuleNotFoundError: No module named 'torch._six'
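torch._six was removed in PyTorch 2.0, and older deepspeed releases still import it, so this is a torch/deepspeed version mismatch rather than anything gpt4all-specific. Either upgrade deepspeed to a release that has dropped the torch._six import, or pin torch below 2.0:
pip install -U deepspeed
or
pip install "torch<2.0"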
I would like to use this AI to translate the files of a whole repository from Python to TypeScript. ChatGPT can do that, but only file by file.
How can I provide (limited) access to a local folder on my computer (Mac)?
I went through the README on my Mac M2 and brew-installed python3 and pip3, then replaced python with python3 and pip with pip3 in all the commands. I got to the point of running this command:
python generate.py --config configs/generate/generate.yaml --prompt "Write a script to reverse a string in Python"
Setting up model
Traceback (most recent call last):
File "/opt/homebrew/Cellar/[email protected]/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/huggingface_hub/utils/_errors.py", line 259, in hf_raise_for_status
response.raise_for_status()
File "/opt/homebrew/Cellar/[email protected]/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/None/resolve/main/config.json
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/me/gpt/gpt4all/transformers/src/transformers/utils/hub.py", line 409, in cached_file
resolved_file = hf_hub_download(
^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/[email protected]/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/[email protected]/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1160, in hf_hub_download
metadata = get_hf_file_metadata(
^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/[email protected]/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/[email protected]/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1501, in get_hf_file_metadata
hf_raise_for_status(r)
File "/opt/homebrew/Cellar/[email protected]/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/huggingface_hub/utils/_errors.py", line 291, in hf_raise_for_status
raise RepositoryNotFoundError(message, response) from e
huggingface_hub.utils._errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-6424332a-26533405190ac1c961e12ed4)
Repository Not Found for url: https://huggingface.co/None/resolve/main/config.json.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid username or password.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/me/gpt/gpt4all/generate.py", line 52, in <module>
model, tokenizer = setup_model(config)
^^^^^^^^^^^^^^^^^^^
File "/Users/me/gpt/gpt4all/generate.py", line 20, in setup_model
model = AutoModelForCausalLM.from_pretrained(config["model_name"], device_map="auto", torch_dtype=torch.float16)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/me/gpt/gpt4all/transformers/src/transformers/models/auto/auto_factory.py", line 441, in from_pretrained
config, kwargs = AutoConfig.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/me/gpt/gpt4all/transformers/src/transformers/models/auto/configuration_auto.py", line 905, in from_pretrained
config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/me/gpt/gpt4all/transformers/src/transformers/configuration_utils.py", line 573, in get_config_dict
config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/me/gpt/gpt4all/transformers/src/transformers/configuration_utils.py", line 628, in _get_config_dict
resolved_config_file = cached_file(
^^^^^^^^^^^^
File "/Users/me/gpt/gpt4all/transformers/src/transformers/utils/hub.py", line 424, in cached_file
raise EnvironmentError(
OSError: None is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.
What am I missing, and what do I do now? How do I get it to generate some output without using the interactive prompt? I was able to successfully download that 4GB file, put it in the chat folder, and run the interactive prompt, but I would like to get this runnable as a shell or Node.js script so I can programmatically make some calls. How can it be done?
Thanks!
P.S. For clarity, I basically want to do something like this:
gpt4all --prompt "List some dogs" > output.md
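The /None/ in the failing URL is the giveaway: model_name in configs/generate/generate.yaml was left at its "REPLACE HERE" placeholder (quoted earlier above), so transformers tries to fetch a repo literally named None. Fill in model_name and tokenizer_name with the base LLaMA checkpoint you converted (and lora_path with nomic-ai/gpt4all-lora). Once the config resolves, the command is already non-interactive, so the shell use case is just:
python generate.py --config configs/generate/generate.yaml --prompt "List some dogs" > output.md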
Adding MIT License to the repository will make it clear that anyone can use, copy and modify this software for any purpose without any restrictions. It also makes it easier for people who want to contribute or collaborate with others on open-source projects using these licenses.
Disclaimer: the above comment was generated using gpt4all ;)
make it simple to get a fastapi server running with a chat endpoint
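A minimal sketch of what that could look like (generate_text is a hypothetical stub; nothing in the repo currently exposes the quantized model to Python):

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 512

def generate_text(prompt: str, max_new_tokens: int) -> str:
    # Hypothetical stub: swap in a real binding to the quantized model.
    return f"(echo) {prompt}"

@app.post("/chat")
def chat(req: ChatRequest) -> dict:
    return {"response": generate_text(req.prompt, req.max_new_tokens)}
```

Run with uvicorn server:app (assuming the file is saved as server.py) and POST JSON like {"prompt": "hello"} to /chat.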
There is no configs/train/finetune-7b.yaml in the repo, but the README requires it for training.
The model is distributed in .bin form, and my laptop is a Mac with M1.
At first I just clicked it to decompress it with the usual app (The Unarchiver), but it failed.
Then I used the terminal and ran 'chmod a+x filename.bin', but that failed again.
The file listed is not a binary that runs on Windows: the README says to run cd chat; ./gpt4all-lora-quantized-linux-x86 "on Windows/Linux", but that is a Linux executable.
have a faq doc for repetitive questions in the discord
gpt4all fails to load ggml weights (both old and new formats) downloaded for llama.cpp and alpaca.cpp. Is there a way to convert them to a format gpt4all understands, or is this a bug?
Is "Epoch 2" the more recent / final LoRA?
Both are exactly 8.41MB. That does not seem right.
https://huggingface.co/nomic-ai/gpt4all-lora/tree/main
vs
https://huggingface.co/nomic-ai/gpt4all-lora-epoch-2/tree/main
Also is there a link to a merged GPU model and to the 16bit ggml model?
I don't see any API access?
I'd like a RESTful interface rather than using the webpage. Is there any way to do this now? Any plans for adding it soon?
Thanks.
Is this provided anywhere? Didn't see it in the repo or technical report.
If I quit the process (Ctrl+C), is there a way to resume it, since the process keeps running?