
Comments (7)

jermwatt avatar jermwatt commented on May 23, 2024 2

@hawkeye-sama - yeah it would.

How about we spin up the model (and keep it running) via Python's subprocess, writing to and reading from it through a subprocess handler? When we're done, we terminate it. Something like this excellent description of REPL control, which I crudely adapt for our use case below.

Note: this should be run from the home dir of the repo; I'm testing on a Mac with an M1.

import time, subprocess

# start up the process - keep it open.  Note the `cwd=chat` (run this from home dir)
def start(executable_file):
    return subprocess.Popen(
        executable_file,
        cwd="chat", 
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE)

# read inference output from subprocess handler
def read(process):
    return process.stdout.readline().decode("utf-8").strip()

# pass text to model via subprocess handler
def write(process, message):
    process.stdin.write(f"{message.strip()}\n".encode("utf-8"))
    process.stdin.flush()

# terminate model 
def terminate(process):
    process.stdin.close()
    process.terminate()
    process.wait(timeout=0.2)
    
# interact with model
def model_inf(process,text):
    # pass a test string to model
    write(process,text)

    # sleep while the model processes the input - again a cheap hack
    time.sleep(3)

    # read and print model output via subprocess handler
    print(read(process))
    

## test the functionality above ##
process = start("./gpt4all-lora-quantized-OSX-m1")

# sleep while model loads - a cheap hack (we can do better)
time.sleep(5)

# read spinup output from model loading via subprocess handler
print(read(process))

# send text to model, report output
text = "what is the capital of france?"; print(text)
model_inf(process,text)

text = "what is the square root of 3?"; print(text)
model_inf(process,text)

## we could go on writing and reading... until at last 
# terminate via subprocess handler
terminate(process)

print out from running the above:

what is the capital of france?
> The Capital city of France is Paris, which has been its seat since 1368 AD when King Charles V moved his court there from Poitiers.


what is the square root of 3?
> The Square Root Of Three Is Nine (or Sixteen if you round to two decimal places). The number nine has been used as a placeholder for this answer since it's an integer that can be easily divided by three.
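The fixed sleeps above are fragile: if loading or generation takes longer than the sleep, readline misses output. One way to do better (a sketch of a generic technique, not anything gpt4all-specific; the reader thread and idle timeout are my own additions) is to pump stdout onto a queue from a background thread and read until the process goes quiet:

```python
import queue
import subprocess
import threading

def start(cmd, cwd=None):
    # spawn the process once and keep it alive
    return subprocess.Popen(
        cmd, cwd=cwd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL)

def make_reader(process):
    # background thread pushes each stdout line onto a queue
    q = queue.Queue()
    def pump():
        for line in process.stdout:
            q.put(line.decode("utf-8").rstrip("\n"))
    threading.Thread(target=pump, daemon=True).start()
    return q

def read_until_idle(q, idle=2.0):
    # collect lines until the process has been silent for `idle` seconds
    lines = []
    while True:
        try:
            lines.append(q.get(timeout=idle))
        except queue.Empty:
            return lines
```

With this, both the load-time sleep and the per-prompt sleep collapse into read_until_idle, at the cost of waiting out one idle period per read.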

This could be rolled into a bash script, or the pattern translated to JS (Node's equivalent of subprocess is the child_process module).

We might be able to roll something like this into a wrapper: a pip-installable package around the executable.
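A minimal sketch of what such a wrapper could look like (ModelProcess and its warmup/wait parameters are hypothetical names of my own, and the sleeps are the same crude hack as above):

```python
import subprocess
import time

class ModelProcess:
    """Hypothetical wrapper sketch: keeps the executable alive and
    exposes ask()/close() instead of raw pipe plumbing."""

    def __init__(self, executable, cwd="chat", warmup=5.0):
        self.proc = subprocess.Popen(
            [executable], cwd=cwd,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE)
        time.sleep(warmup)          # crude: wait for the model to load

    def ask(self, prompt, wait=3.0):
        # send one prompt, wait, and read back a single line of output
        self.proc.stdin.write(f"{prompt.strip()}\n".encode("utf-8"))
        self.proc.stdin.flush()
        time.sleep(wait)            # crude: wait for generation
        return self.proc.stdout.readline().decode("utf-8").strip()

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        # terminate the model on context-manager exit
        self.proc.stdin.close()
        self.proc.terminate()
        self.proc.wait(timeout=1)
```

Usage would then be `with ModelProcess("./gpt4all-lora-quantized-OSX-m1") as m: print(m.ask("what is the capital of france?"))`.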

from gpt4all.

pratt3000 avatar pratt3000 commented on May 23, 2024

What do you mean by the interactive prompt? I am facing the same issue right now.


khizarhussain19 avatar khizarhussain19 commented on May 23, 2024

I am facing the exact same issue. Any luck?


lancejpollard avatar lancejpollard commented on May 23, 2024

The interactive prompt, when you run this:

cd chat;./gpt4all-lora-quantized-OSX-m1

It gets you into a REPL, basically. I don't want the REPL; I want to call the program from a bash script, for example.


jermwatt avatar jermwatt commented on May 23, 2024

You can use the -p flag, as illustrated in the example below (the output requires parsing):

./gpt4all-lora-quantized-OSX-m1 -p "What is the capital of France?"

response:

main: seed = 1680139956
llama_model_load: loading model from 'gpt4all-lora-quantized.bin' - please wait ...
llama_model_load: ggml ctx size = 6065.35 MB
llama_model_load: memory_size =  2048.00 MB, n_mem = 65536
llama_model_load: loading model part 1/1 from 'gpt4all-lora-quantized.bin'
llama_model_load: .................................... done
llama_model_load: model size =  4017.27 MB / num tensors = 291

system_info: n_threads = 4 / 10 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 | 
sampling parameters: temp = 0.100000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000


The current (2019) capital city of France, Paris
 [end of text]
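The parsing could be as simple as cutting away everything before the sampling-parameters line and after the [end of text] marker — a rough sketch against the transcript above (extract_answer is a made-up helper of mine, and it assumes the log format shown here):

```python
def extract_answer(raw: str) -> str:
    # keep only the text after the "sampling parameters:" line
    # and before the "[end of text]" marker
    body = raw.split("sampling parameters:", 1)[-1]
    body = body.split("\n", 1)[-1]        # drop the parameters line itself
    body = body.split("[end of text]")[0]
    return body.strip()
```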


hawkeye-sama avatar hawkeye-sama commented on May 23, 2024

@jermwatt this would load the model each time, wouldn't it? I think we want the model preloaded into memory, with the already-running process handling the task, when we do something like this:

gpt4all --prompt "List some dogs" > output.md


hawkeye-sama avatar hawkeye-sama commented on May 23, 2024

Oh, gotta try this. Thanks! I'll probably try to hook it up to an API and see how that goes.

