

localllm_guidance's Issues

Not returning "Final Answer".

Hello Bud - Thanks for this great work!

I tested your code with the same Vicuna model. When I run the agent I get the following output:

### Input:
What is the population of Congo?

Response:

Question: What is the population of Congo?
Thought: First, I need to find the population of Congo.
Action: Google Search
Action Input: What is the population of Congo?
Observation: 95.89 million (2021)
Thought: Now, I should check the source of this information.
Action: Google Search
Action Input: Is 95.89 million the correct population of Congo?
Observation: Democratic Republic of the Congo: Population size, in millions: For that indicator, ... minimum of 15.28 million in 1960 and a maximum of 95.89 million in 2021. From 1960 to 2021 the population of the Democratic Republic of the Congo increased from 15.25 million to 95.89 million people. This is a growth of 528.9 ... the DR Congo 2020 population is estimated at 89,561,403 people at mid year according to UN data. the DR Congo population is equivalent to 1.15% of the total ... Missing: 95.89 | Must include:95.89. Congo 2020 population is estimated at 5,518,087 people at mid year according to UN data. Congo population is equivalent to 0.07% of the total world population. Missing: 95.89 million. Population, Total. 95.89 million. (as of 2021) ; GNI, Atlas Method. 52.54 billion. (in current US$ as of 2021) ; GNI per capita, Atlas Method. 550. (in current US ... The Republic of Congo (Congo-Brazzaville), located in Central Africa, covers 342,000 km2. Its population of 5.7 million people is largely young, ... Missing: 95.89 | Must include:95.89. By land area, the DRC is the second-largest country in Africa, after Algeria, and the 11th-largest in the world.With a population of around 112 million, ... Country Characteristics ; Population, Total. 95.89 million. (as of 2021) ; GNI, Atlas Method. 52.54 billion. (in current US$ as of 2021) ; GNI per capita, Atlas ... With a current population of about 89.56 million, DR Congo is projected to surpass 100 million in 2024 and double its population by 2047. DR Congo has one of ... Missing: 95.89 | Must include:95.89.
Thought: I now know the final answer.
Action: Google Search
Action Input: What is the population of Congo?
Observation: 95.89 million (2021)
Thought: I now know the final answer.
Final Answer: The population of Congo is 95.89 million (2021).

If I change the num_iter parameter to 5, the agent just keeps playing the game. I cannot find a solution; maybe you have an answer?
(The agent knows the final answer but continues searching anyway.)

Edit: I see, you have the same "problem" in the Gradio demo image. The model returns "I know now the final answer", and the next action is "Google Search" instead of "Final Answer:".
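One workaround (a sketch, not code from this repo): instead of letting the model pick another Action after a final-sounding Thought, check the generated Thought in Python and force the Final Answer branch. `is_final_thought` is a hypothetical helper name:

```python
def is_final_thought(thought: str) -> bool:
    """Return True when the model signals it has the final answer.

    The wording varies ("I now know the final answer" vs.
    "I know now the final answer"), so match loosely on the key phrase
    rather than comparing the whole string.
    """
    t = thought.lower()
    return "final answer" in t and ("i now know" in t or "i know now" in t)
```

The agent loop can call this after each generated Thought and, when it returns True, render only the `Final Answer:` continuation instead of another `Action:` step.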

Use cycles with stop for agents

Hi @QuangBK, congrats on your work here, really nice. I was playing around with your agent, but it didn't work for my use case, because some of my agents need to search the internet up to 5 times. So I extended your concept with an {{#each}} cycle. It definitely works, but I was hoping to find a better way to exit the cycle once the final answer is reached. Any thoughts? Thank you!

Here's your code, modified for my use case:


prompt_template = """
(...)
### Input:
{{question}}

### Response:
Question: {{question}}
{{#each iterations}}
Thought: {{gen 'thought' stop='\\n'}}
{{#if (== thought " I now know the final answer.")}}
Final Answer: {{gen 'final' stop='\\n'}}
{{await 'instruction'}}
{{/if}}
Action: {{select 'tool_name' options=valid_tools}}
Action Input: {{gen 'actInput' stop='\\n'}}
Observation:{{search actInput}}
{{/each}}"""

prompt = guidance(prompt_template)
result = prompt(
    question="How old is the Portuguese president's wife?", 
    search=searchGoogle, 
    valid_answers=valid_answers, 
    valid_tools=valid_tools,
    iterations=[1,2,3,4,5]
)
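One alternative to a fixed `{{#each iterations}}` cycle (a sketch, assuming you can split the template into per-iteration calls): drive the loop from Python and break as soon as the model commits to "Final Answer". Here `run_step` is a hypothetical stand-in for one guidance-program call that performs a single Thought/Action/Action Input/Observation round:

```python
MAX_ITERS = 5  # search budget, matching iterations=[1,2,3,4,5] above

def run_agent(run_step, question):
    history = f"Question: {question}\n"
    for _ in range(MAX_ITERS):
        step = run_step(history)              # one ReAct round
        history += step["text"]               # append generated text to the prompt
        if step["action"] == "Final Answer":  # model chose to stop
            return step["answer"]
    return None  # budget exhausted without a final answer
```

This avoids needing `{{await}}` tricks inside the template: the early exit lives in ordinary Python control flow.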

Example agent for local sql database

Thank you for the outstanding effort! Would it be possible to include an example demonstrating how to retrieve SQL data from a local database? Your assistance is greatly appreciated.
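Until an official example lands, a minimal sketch of a local-SQL tool, assuming the agent's tools are plain callables mapped by name (as `load_tools` returns). It uses `sqlite3` from the standard library; the tool name and factory are illustrative only:

```python
import sqlite3

def make_sql_tool(db_path):
    """Build a tool that runs a SQL query against a local SQLite file
    and returns the rows as text for the agent's Observation."""
    def run_sql(query: str) -> str:
        con = sqlite3.connect(db_path)
        try:
            rows = con.execute(query).fetchall()
        finally:
            con.close()
        return "\n".join(str(row) for row in rows)
    return run_sql

# e.g. register it alongside the search tool:
# dict_tools["SQL Query"] = make_sql_tool("local.db")
```

You would also add the tool name to `valid_tools` and describe it in the prompt so the model knows when to pick it.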

RuntimeError: There is no current event loop in thread 'AnyIO worker thread'.

What I've done:

  1. conda create -n gptq python=3.9 -y
  2. conda activate gptq
  3. conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
  4. git clone https://github.com/oobabooga/GPTQ-for-LLaMa.git
  5. cd GPTQ-for-LLaMa
  6. pip install -r requirements.txt
  7. python setup_cuda.py install
  8. pip install guidance
  9. pip install langchain
  10. pip install gradio
  11. python app.py

What I see

The server starts, but when I try to run any question I get the following runtime error:

(gptq) david@shodan:~/Documents/Programming/Personal/llm/langchain/localLLM_guidance-main$ python app.py
start to install package: redis
successfully installed package: redis
start to install package: redis_om
successfully installed package: redis_om
Loading model ...
/home/david/miniconda3/envs/gptq/lib/python3.9/site-packages/safetensors/torch.py:99: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
with safe_open(filename, framework="pt", device=device) as f:
Done.
Running on local URL: http://0.0.0.0:7860

To create a public link, set share=True in launch().
Traceback (most recent call last):
File "/home/david/miniconda3/envs/gptq/lib/python3.9/site-packages/gradio/routes.py", line 516, in predict
output = await route_utils.call_process_api(
File "/home/david/miniconda3/envs/gptq/lib/python3.9/site-packages/gradio/route_utils.py", line 219, in call_process_api
output = await app.get_blocks().process_api(
File "/home/david/miniconda3/envs/gptq/lib/python3.9/site-packages/gradio/blocks.py", line 1437, in process_api
result = await self.call_function(
File "/home/david/miniconda3/envs/gptq/lib/python3.9/site-packages/gradio/blocks.py", line 1109, in call_function
prediction = await anyio.to_thread.run_sync(
File "/home/david/miniconda3/envs/gptq/lib/python3.9/site-packages/anyio/to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/home/david/miniconda3/envs/gptq/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "/home/david/miniconda3/envs/gptq/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 807, in run
result = context.run(func, *args)
File "/home/david/miniconda3/envs/gptq/lib/python3.9/site-packages/gradio/utils.py", line 650, in wrapper
response = f(*args, **kwargs)
File "/home/david/Documents/Programming/Personal/llm/langchain/localLLM_guidance-main/app.py", line 23, in greet
final_answer = custom_agent(name)
File "/home/david/Documents/Programming/Personal/llm/langchain/localLLM_guidance-main/server/agent.py", line 71, in __call__
prompt_start = self.guidance(prompt_start_template)
File "/home/david/miniconda3/envs/gptq/lib/python3.9/site-packages/guidance/__init__.py", line 22, in __call__
return Program(template, llm=llm, cache_seed=cache_seed, logprobs=logprobs, silent=silent, async_mode=async_mode, stream=stream, caching=caching, await_missing=await_missing, logging=logging, **kwargs)
File "/home/david/miniconda3/envs/gptq/lib/python3.9/site-packages/guidance/_program.py", line 155, in __init__
self._execute_complete = asyncio.Event() # fires when the program is done executing to resolve await
File "/home/david/miniconda3/envs/gptq/lib/python3.9/asyncio/locks.py", line 177, in __init__
self._loop = events.get_event_loop()
File "/home/david/miniconda3/envs/gptq/lib/python3.9/asyncio/events.py", line 642, in get_event_loop
raise RuntimeError('There is no current event loop in thread %r.'
RuntimeError: There is no current event loop in thread 'AnyIO worker thread'.

Modifications to the code:

os.environ["SERPER_API_KEY"] = '<my SERPER_API_KEY>'
MODEL_PATH = '/home/david/ai/text-generation-webui/models/TheBloke_Wizard-Vicuna-7B-Uncensored-GPTQ/'
CHECKPOINT_PATH = '/home/david/ai/text-generation-webui/models/TheBloke_Wizard-Vicuna-7B-Uncensored-GPTQ/model.safetensors'
Any clues as to why I'm getting a RuntimeError: There is no current event loop in thread 'AnyIO worker thread' error?

Many thanks,
Dave
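A workaround sketch, based on the traceback above: guidance's `Program.__init__` creates an `asyncio.Event()`, which on Python 3.9 requires an event loop in the current thread, and Gradio runs handlers in an AnyIO worker thread that has none. Creating a loop at the top of the handler should avoid the `RuntimeError`:

```python
import asyncio

def ensure_event_loop():
    """Create and register an event loop for the current thread if it
    doesn't already have one (AnyIO worker threads don't)."""
    try:
        asyncio.get_event_loop()
    except RuntimeError:
        asyncio.set_event_loop(asyncio.new_event_loop())

# in app.py, e.g.:
# def greet(name):
#     ensure_event_loop()          # guard for the AnyIO worker thread
#     final_answer = custom_agent(name)
#     return final_answer, final_answer['fn']
```

This is an assumption about where the loop is missing, not a confirmed fix; pinning an older Gradio that runs handlers on the main thread may also work.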

How to get the required model?

I noticed you use GPTQ, but apparently it's incompatible with M1 Macs. Could you please generalize the project so that any model can be selected, including models that M1 Macs can run?
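In the meantime, a sketch of one way to generalize: GPTQ's CUDA kernels need an NVIDIA GPU, so on an M1 Mac you would load a regular (unquantized) Hugging Face checkpoint via `guidance.llms.Transformers` and pick the torch backend at runtime. The availability flags are passed in here so the selection logic is testable without torch; in practice you would feed it `torch.cuda.is_available()` and `torch.backends.mps.is_available()`:

```python
def pick_device(cuda_ok: bool, mps_ok: bool) -> str:
    """Choose a torch device string for the current machine."""
    if cuda_ok:
        return "cuda:0"   # NVIDIA GPU: GPTQ path works here
    if mps_ok:
        return "mps"      # Apple-silicon GPU via PyTorch's MPS backend
    return "cpu"
```

On `"mps"` or `"cpu"` you would skip `load_model_main`'s GPTQ loading entirely and pass a plain model/tokenizer pair to guidance instead.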

OOM - llama2 model - guidance.llm = guidance.llms.transformers.LLaMA("remyxai/ffmperative-7b", device_map="auto")

related - guidance-ai/guidance#328

import gradio as gr
import guidance
import torch
from server.model import load_model_main
from server.tools import load_tools
from server.agent import CustomAgentGuidance

import os
os.environ["SERPER_API_KEY"] = 'REDACTED-BUT-DID-INCLUDE-IT'

MODEL_PATH = '/home/quang/working/LLMs/oobabooga_linux/text-generation-webui/models/TheBloke_wizard-mega-13B-GPTQ'
CHECKPOINT_PATH = '/home/quang/working/LLMs/oobabooga_linux/text-generation-webui/models/TheBloke_wizard-mega-13B-GPTQ/wizard-mega-13B-GPTQ-4bit-128g.no-act.order.safetensors'
DEVICE = torch.device('cuda:0')

examples = [
    ["How much is the salary of number 8 of Manchester United?"],
    ["What is the population of Congo?"],
    ["Where was the first president of South Korean born?"],
    ["What is the population of the country that won World Cup 2022?"]    
]

def greet(name):
    final_answer = custom_agent(name)
    return final_answer, final_answer['fn']

# model, tokenizer = load_model_main(MODEL_PATH, CHECKPOINT_PATH, DEVICE)
# llama = guidance.llms.Transformers(model=model, tokenizer=tokenizer, device=0)
# guidance.llm = llama 
## OVERRIDING HERE------------------------------------->>>>>>>>>>>>>>>>>>
guidance.llm = guidance.llms.transformers.LLaMA("remyxai/ffmperative-7b", device_map="auto")

dict_tools = load_tools()

custom_agent = CustomAgentGuidance(guidance, dict_tools)

list_outputs = [gr.Textbox(lines=5, label="Reasoning"), gr.Textbox(label="Final Answer")]
demo = gr.Interface(fn=greet, inputs=gr.Textbox(lines=1, label="Input Text", placeholder="Enter a question here..."), 
                    outputs=list_outputs,
                    title="Demo ReAct agent with Guidance",
                    description="The source code can be found at: https://github.com/QuangBK/localLLM_guidance/",
                   examples=examples)
demo.launch(server_name="0.0.0.0", server_port=7860)
File "/home/oem/miniconda3/envs/torch2/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 216, in forward
    down_proj = self.down_proj(self.act_fn(self.gate_proj(x)) * self.up_proj(x))
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 23.69 GiB total capacity; 21.41 GiB already allocated; 4.94 MiB free; 21.61 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Error in program:  CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 23.69 GiB total capacity; 21.41 GiB already allocated; 4.94 MiB free; 21.61 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Traceback (most recent call last):
  File "/home/oem/miniconda3/envs/torch2/lib/python3.10/site-packages/gradio/routes.py", line 442, in run_predict
    output = await app.get_blocks().process_api(
  File "/home/oem/miniconda3/envs/torch2/lib/python3.10/site-packages/gradio/blocks.py", line 1392, in process_api
    result = await self.call_function(
  File "/home/oem/miniconda3/envs/torch2/lib/python3.10/site-packages/gradio/blocks.py", line 1097, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/home/oem/miniconda3/envs/torch2/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/oem/miniconda3/envs/torch2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/home/oem/miniconda3/envs/torch2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/home/oem/miniconda3/envs/torch2/lib/python3.10/site-packages/gradio/utils.py", line 703, in wrapper
    response = f(*args, **kwargs)
  File "/media/2TB/localLLM_guidance/app.py", line 23, in greet
    final_answer = custom_agent(name)
  File "/media/2TB/localLLM_guidance/server/agent.py", line 77, in __call__
    if result_mid['answer'] == 'Final Answer':
  File "/home/oem/miniconda3/envs/torch2/lib/python3.10/site-packages/guidance/_program.py", line 470, in __getitem__
    return self._variables[key]
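Two mitigations worth trying, following the hint in the error message itself (a sketch, not a verified fix for this card/model combination):

```python
# Apply the allocator hint from the OOM message. This must be set before
# torch initializes CUDA, so ideally before `import torch`.
import os
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# If fragmentation isn't the issue, shrink the resident weights instead,
# e.g. 8-bit loading (hypothetical kwargs forwarded to from_pretrained;
# requires the bitsandbytes and accelerate packages):
# guidance.llm = guidance.llms.transformers.LLaMA(
#     "remyxai/ffmperative-7b", device_map="auto", load_in_8bit=True)
```

Note that the failure happens inside the MLP forward pass with only ~5 MiB free, so the model itself fits; it is the activation spike during generation that tips it over, which is exactly the case the allocator setting targets.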
