
gorilla's Introduction

Gorilla: Large Language Model Connected with Massive APIs [Project Website]

🚒 GoEx: a runtime for executing LLM-generated actions like code & API calls. GoEx presents "undo" and "damage confinement" abstractions for mitigating the risk of unintended actions taken in LLM-powered systems. Release blog Paper.

🎉 Berkeley Function Calling Leaderboard: how do models stack up for function calling? 🎯 Releasing the Berkeley Function Calling Leaderboard. Read more in our Release Blog.

🏆 Gorilla OpenFunctions v2 sets a new SoTA for open-source LLMs 💪 On par with GPT-4 🙌 Supports more languages 👌 Blog.

🔥 Gorilla OpenFunctions is a drop-in alternative for function calling! Release Blog

🟢 Gorilla is Apache 2.0! With Gorilla fine-tuned on MPT and Falcon, you can use Gorilla commercially with no obligations! ⛳

🚀 Try Gorilla in 60s: Colab

💻 Use Gorilla in your CLI with pip install gorilla-cli

📠 Check out our blogs for all things tool-use/function-calling!

🗞️ Check out our paper! arXiv

👋 Join our Discord! Discord

Gorilla enables LLMs to use tools by invoking APIs. Given a natural-language query, Gorilla comes up with the semantically and syntactically correct API to invoke. With Gorilla, we are the first to demonstrate how to use LLMs to invoke 1,600+ (and growing) API calls accurately while reducing hallucination. We also release APIBench, the largest collection of APIs, curated and easy to train on! Join us as we try to expand the largest API store and teach LLMs how to write them! Hop on our Discord, open a PR, or email us if you would like to have your API incorporated as well.
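To make this concrete, here is a minimal sketch of querying a Gorilla endpoint through its OpenAI-compatible chat-completions interface, the same protocol our hosted server speaks. It assumes the pre-1.0 openai Python client; the endpoint URL and model name below are placeholders for the hosted server from the Colab or your own deployment.

import openai

# Point the pre-1.0 openai client at a Gorilla server instead of OpenAI.
openai.api_key = "EMPTY"                      # the Gorilla server ignores the key
openai.api_base = "http://localhost:8000/v1"  # placeholder: hosted or self-hosted endpoint

completion = openai.ChatCompletion.create(
    model="gorilla-7b-hf-v1",
    messages=[{"role": "user", "content": "I would like to translate from English to Chinese."}],
)
print(completion.choices[0].message.content)  # the generated API call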

News

  • ⏰ [04/01] Introducing cost and latency metrics into the Berkeley Function Calling Leaderboard!
  • 🚀 [03/15] RAFT: Adapting Language Model to Domain Specific RAG is live! [MSFT-Meta blog] [Berkeley Blog]
  • 🏆 [02/26] Berkeley Function Calling Leaderboard is live!
  • 🎯 [02/25] OpenFunctions v2 sets new SoTA for open-source LLMs!
  • 🔥 [11/16] Excited to release Gorilla OpenFunctions!
  • 💻 [06/29] Released gorilla-cli, LLMs for your CLI!
  • 🟢 [06/06] Released commercially usable, Apache 2.0 licensed Gorilla models!
  • 🚀 [05/30] Provided the CLI interface to chat with Gorilla!
  • 🚀 [05/28] Released Torch Hub and TensorFlow Hub models!
  • 🚀 [05/27] Released the first Gorilla model! Colab or 🤗!
  • 🔥 [05/27] We released the APIZoo contribution guide for community API contributions!
  • 🔥 [05/25] We released the APIBench dataset and the evaluation code of Gorilla!

Gorilla Gradio

Try Gorilla LLM models in HF Spaces or the Gradio Colab.

Get Started

Inference: Run Gorilla locally (inference/README.md).

Evaluation: We have included prompts and responses for APIBench, with and without retrievers, along with the Abstract Syntax Tree (AST) matching evaluation script at evaluation.

Repository Organization

Our repository organization is shown below.

  • The berkeley-function-call-leaderboard folder contains scripts for evaluating the function-calling ability of models.
  • The data folder contains all the evaluation APIs (APIBench) and the community contributed APIs.
  • The eval folder contains all our evaluation code as well as the Gorilla outputs.
  • The inference folder contains all the inference code for running Gorilla locally.
  • The openfunctions folder contains the inference code for the OpenFunctions model(s).

For our dataset collection, documentation for all 1,640 APIs is in data/api. We also include the APIBench dataset, created via self-instruct, in data/apibench. For evaluation, we convert this into an LLM-friendly chat format: the questions are in eval/eval-data/questions and the corresponding responses are in eval/eval-data/responses. The evaluation scripts are in eval/eval-scripts. This is entirely sufficient to train Gorilla yourself and reproduce our results. Please see evaluation for details on how to use our evaluation pipeline.
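As a quick illustration, here is a sketch of loading these files. The torchhub file names follow the {api_name} patterns shown in the tree below; the questions file name, with 0_shot as the eval_metric value, is an assumption for illustration.

import json

def load_jsonl(path):
    # One JSON object per line; skip blank lines.
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]

apis = load_jsonl("data/api/torchhub_api.jsonl")          # API documentation
train = load_jsonl("data/apibench/torchhub_train.jsonl")  # self-instruct training pairs
questions = load_jsonl("eval/eval-data/questions/torchhub/questions_torchhub_0_shot.jsonl")
print(len(apis), len(train), len(questions))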

Additionally, we have released all the model weights. gorilla-7b-hf-v0 lets you invoke over 925 Hugging Face APIs. Similarly, gorilla-7b-tf-v0 and gorilla-7b-th-v0 cover 626 (exhaustive) TensorFlow v2 and 94 (exhaustive) Torch Hub APIs, respectively. gorilla-mpt-7b-hf-v0 and gorilla-falcon-7b-hf-v0 are Apache 2.0 licensed models (commercially usable) fine-tuned on MPT-7B and Falcon-7B, respectively. We will release a model combining all three, with generic chat capability and community-contributed APIs, as soon as we can scale our serving infrastructure. You can run Gorilla locally following the instructions in the inference/ sub-directory, or use our hosted Gorilla chat completion API (see Colab)! If you have any suggestions, or if you run into any issues, please feel free to reach out to us through Discord or email, or raise a GitHub issue.
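For example, a hedged sketch of running one of the Apache 2.0 checkpoints with Hugging Face transformers (the Hub id is assumed from the gorilla-llm organization and the model names above; the prompt and generation settings are illustrative):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gorilla-llm/gorilla-falcon-7b-hf-v0"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,  # Falcon/MPT checkpoints ship custom modeling code
)

prompt = "I want to classify the sentiment of a sentence using a Hugging Face API."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))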

gorilla
├── berkeley-function-call-leaderboard (data and scripts to eval models' function-calling ability)
├── data
│   ├── api (TF/HF/TH APIs used in generating apibench)
│   │   ├── {api_name}_api.jsonl
│   ├── apibench (Evaluating LLM models) v-1.0
│   │   ├── {api_name}_train.jsonl, {api_name}_eval.jsonl
│   ├── apizoo (Contributed by the community - evolving)
│   │   ├── username1.json
│   │   ├── username2.json
│   │   ├── ...
├── eval
│   ├── README.md
│   ├── get_llm_responses.py
│   ├── eval-scripts
│   │   ├── ast_eval_{api_name}.py
│   ├── eval-data
│   │   ├── questions
│   │   │   ├── {api_name}
│   │   │   │   ├── questions_{api_name}_{eval_metric}.jsonl
│   │   ├── responses
│   │   │   ├── {api_name}
│   │   │   │   ├── responses_{api_name}_Gorilla_FT_{eval_metric}.jsonl
│   │   │   │   ├── responses_{api_name}_Gorilla_RT_{eval_metric}.jsonl
├── inference
│   ├── README.md
│   ├── serve
│   │   ├── gorilla_cli.py
│   │   ├── conv_template.py
├── openfunctions
│   ├── openfunctions-v1 (data and scripts for openfunctions-v0 and v1)
│   ├── utils (parsing script for openfunctions-v2)
│   ├── inference_* (openfunctions-v2 hosted/local inference code)

Contributing Your API

We aim to build an open-source, one-stop shop for all the APIs that LLMs can interact with! Any suggestions and contributions are welcome! Please see the details on how to contribute. THIS WILL ALWAYS REMAIN OPEN SOURCE.

FAQ(s)

  1. I would like to use Gorilla commercially. Is there going to be an Apache 2.0 licensed version?

Yes! We now have models that you can use commercially without any obligations.

  2. Can we use Gorilla with other tools like LangChain, etc.?

Absolutely! You've highlighted a great aspect of our tools. Gorilla is an end-to-end model, specifically tailored to serve correct API calls (tools) without requiring any additional coding. It's designed to work as part of a wider ecosystem and can be flexibly integrated within agentic frameworks and other tools.

LangChain is a versatile developer tool. Its "agents" can efficiently swap in any LLM, Gorilla included, making it a highly adaptable solution for various needs.

The beauty of these tools truly shines when they collaborate, complementing each other's strengths and capabilities to create an even more powerful and comprehensive solution. This is where your contribution can make a difference. We enthusiastically welcome any inputs to further refine and enhance these tools.

Check out our blog on How to Use Gorilla: A Step-by-Step Walkthrough to see all the different ways you can integrate Gorilla in your projects.
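As one illustration of such an integration, here is a minimal sketch of pointing LangChain's OpenAI-compatible chat wrapper at a Gorilla endpoint. It assumes the langchain-openai package is installed; the URL and model name are placeholders.

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gorilla-7b-hf-v1",
    base_url="http://localhost:8000/v1",  # placeholder Gorilla endpoint
    api_key="EMPTY",                      # the server ignores the key
)
reply = llm.invoke("I want to generate an image from a text prompt.")
print(reply.content)  # the suggested API call, usable inside any LangChain agent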

Project Roadmap

In the immediate future, we plan to release the following:

  • BFCL metrics to evaluate contamination
  • BFCL systems metrics including cost and latency
  • BFCL update with "live" data and user-votes
  • Openfunctions-v3 model to support more languages and multi-turn capability
  • Berkeley Function Calling leaderboard (BFCL) for evaluating tool-calling/function-calling models [Feb 26, 2024]
  • Openfunctions-v2 with more languages (Java, JS, Python), relevance detection [Feb 26, 2024]
  • API Zoo Index for easy access to all APIs [Feb 16, 2024]
  • Openfunctions-v1, Apache 2.0, with parallel and multiple function calling [Nov 16, 2023]
  • Openfunctions-v0, Apache 2.0 function calling model [Nov 16, 2023]
  • Release a commercially usable, Apache 2.0 licensed Gorilla model [Jun 5, 2023]
  • Release weights for all APIs from APIBench [May 28, 2023]
  • Run Gorilla LLM locally [May 28, 2023]
  • Release weights for HF model APIs [May 27, 2023]
  • Hosted Gorilla LLM chat for HF model APIs [May 27, 2023]
  • Opening up the APIZoo for contributions from the community
  • Dataset and Eval Code

Propose a new task you would like to work on 🤩

Citation

If you use Gorilla or APIBench, please cite our paper:

@article{patil2023gorilla,
  title={Gorilla: Large Language Model Connected with Massive APIs},
  author={Shishir G. Patil and Tianjun Zhang and Xin Wang and Joseph E. Gonzalez},
  year={2023},
  journal={arXiv preprint arXiv:2305.15334},
} 

gorilla's People

Contributors

amiraflak, aryanvichare, benjaminhuo, cedricvidal, charliejcj, dangeo773, danielfleischer, danielskry, eitanturok, eltociear, fanjia-yan, hannesgith, huanzhimao, jasonzhu1313, joedevon, kaiwen129, meenakshi-mittal, morganmcg1, mzamini92, noppapon, rajveer43, ramanv0, ricklamers, royh02, saikolasani, shawnharmsen, shishirpatil, tanmaydoesai, tianjunz, viniciuslazzari

gorilla's Issues

The bm25 and gpt-index scripts?

          For the different retrievers, we use bm25 (https://en.wikipedia.org/wiki/Okapi_BM25); gpt-index simply uses `Davinci v1` from OpenAI to embed all the documents and does a simple cosine-similarity match at inference time. For oracle, we just provide the ground-truth answer to Gorilla. Hope this helps, and let me know if there are any further questions!

Originally posted by @tianjunz in #21 (comment)

Would you be willing to release the bm25 and gpt-index scripts to help the community reproduce the experimental results?
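For readers who want to experiment before any official release, here is a rough BM25 sketch using the rank_bm25 package. This is an assumption: the original scripts may differ in tokenization and ranking details, and the toy corpus stands in for the real API docs in data/api/*.jsonl.

from rank_bm25 import BM25Okapi

# Toy corpus standing in for the API documentation.
api_docs = [
    "torchhub resnet50 pretrained model for image classification",
    "huggingface translation pipeline english to chinese",
]
bm25 = BM25Okapi([doc.lower().split() for doc in api_docs])

query = "translate a sentence from english to chinese"
best = bm25.get_top_n(query.lower().split(), api_docs, n=1)[0]
# Prepend the retrieved document to the prompt, as in retriever-aware evaluation.
print(f"Use this API documentation for reference: {best}\n{query}")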

[bug] Hosted Gorilla: <Issue>

Exception: Invalid response object from API: '{"object":"error","message":"","code":50001}' (HTTP response code was 400)
Failed model: gorilla-7b-hf-v1, for prompt: What is the ...

What are the document retrievers mentioned in your paper?

Hi!

thanks for the wonderful work! While reading your paper, I was confused by the document retrievers it mentions. You mention several of them, such as gpt and oracle, but I cannot find more specific references or hyperlinks in the paper. I'm wondering where I can find websites or descriptions of these retrievers?

Thank you.

Encountered 1 file(s) that may not have been copied correctly on Windows

I encountered this problem while downloading model weights. It seems weights larger than 4 GB are not correctly handled on Windows. Did you upload the models from a Windows system?

root@4bd793bb2ded:/workspace/gorilla# git lfs install
Updated git hooks.
Git LFS initialized.

root@4bd793bb2ded:/workspace/gorilla# git clone https://huggingface.co/gorilla-llm/gorilla-mpt-7b-hf-v0
Cloning into 'gorilla-mpt-7b-hf-v0'...
remote: Enumerating objects: 35, done.
remote: Counting objects: 100% (35/35), done.
remote: Compressing objects: 100% (34/34), done.
remote: Total 35 (delta 5), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (35/35), 621.68 KiB | 1.84 MiB/s, done.
Filtering content: 100% (2/2), 4.38 GiB | 57.36 MiB/s, done.
Encountered 1 file(s) that may not have been copied correctly on Windows:
        pytorch_model-00001-of-00002.bin

See: `git lfs help smudge` for more details.
root@4bd793bb2ded:/workspace/gorilla/gorilla-mpt-7b-hf-v0# ls -al
total 12989212
drwxr-xr-x 3 root root       4096 Jun  7 00:17 .
drwxr-xr-x 8 root root        161 Jun  7 00:16 ..
drwxr-xr-x 9 root root        174 Jun  7 00:18 .git
-rw-r--r-- 1 root root       1477 Jun  7 00:16 .gitattributes
-rw-r--r-- 1 root root       2068 Jun  7 00:16 README.md
-rw-r--r-- 1 root root       1752 Jun  7 00:16 adapt_tokenizer.py
-rw-r--r-- 1 root root      16818 Jun  7 00:16 attention.py
-rw-r--r-- 1 root root       2493 Jun  7 00:16 blocks.py
-rw-r--r-- 1 root root       1284 Jun  7 00:16 config.json
-rw-r--r-- 1 root root       9080 Jun  7 00:16 configuration_mpt.py
-rw-r--r-- 1 root root      28182 Jun  7 00:16 flash_attn_triton.py
-rw-r--r-- 1 root root        112 Jun  7 00:16 generation_config.json
-rw-r--r-- 1 root root      27219 Jun  7 00:16 hf_prefixlm_converter.py
-rw-r--r-- 1 root root       3639 Jun  7 00:16 meta_init_context.py
-rw-r--r-- 1 root root      17406 Jun  7 00:16 modeling_mpt.py
-rw-r--r-- 1 root root       2563 Jun  7 00:16 norm.py
-rw-r--r-- 1 root root      12558 Jun  7 00:16 param_init_fns.py
-rw-r--r-- 1 root root 9943040275 Jun  7 00:18 pytorch_model-00001-of-00002.bin
-rw-r--r-- 1 root root 3355599187 Jun  7 00:17 pytorch_model-00002-of-00002.bin
-rw-r--r-- 1 root root      16023 Jun  7 00:16 pytorch_model.bin.index.json
-rw-r--r-- 1 root root        129 Jun  7 00:16 special_tokens_map.json
-rw-r--r-- 1 root root    2113738 Jun  7 00:16 tokenizer.json
-rw-r--r-- 1 root root        264 Jun  7 00:16 tokenizer_config.json

Deploying to Replicate

Describe the solution you'd like
I would love to see a Gorilla model hosted on Replicate; it would be nice to be able to utilize their API and hosting.
Additional context
Had a blast playing with the Colab.

[bug] Hosted Gorilla: <Issue>

Exception: Error communicating with OpenAI: HTTPConnectionPool(host='34.132.127.197', port=8000): Max retries exceeded with url: /v1/chat/completions (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f0ec18dabf0>: Failed to establish a new connection: [Errno 111] Connection refused'))
Failed model: gorilla-7b-hf-v0, for prompt: I would like to translate from English to Chinese

[bug] Hosted Gorilla: <Issue>

Exception: Invalid response object from API: '{"object":"error","message":"This model's maximum context length is 2048 tokens. However, you requested 2302 tokens (1790 in the messages, 512 in the completion). Please reduce the length of the messages or completion.","code":40303}' (HTTP response code was 400)

Is there any way to just cut the completion/request to the first 2048 tokens?
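One client-side workaround (a sketch, not an official fix): tokenize the prompt yourself and keep prompt plus completion under the 2048-token window before sending the request. The checkpoint path is a placeholder for a local copy of the model's tokenizer.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path/to/gorilla-7b-hf-v1")  # placeholder path

MAX_CONTEXT, MAX_COMPLETION = 2048, 512
prompt = "..."  # the over-long prompt
ids = tokenizer(prompt)["input_ids"][: MAX_CONTEXT - MAX_COMPLETION]
safe_prompt = tokenizer.decode(ids, skip_special_tokens=True)  # now fits the window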

[feature] FOOM detection.

This seems like the sort of project that could accidentally produce a self-improving superhuman system. Does anyone on the project have an understanding of AI Alignment? Are there efforts to measure the potential for systems built with gorilla to FOOM?

Adding additional APIs to Gorilla LLM

I hope you are doing well; a great thanks for this work.
Is it possible to add additional (private) APIs to Gorilla? We have a large database of APIs and we need to add them to Gorilla. How can we do this? Should we fine-tune the Gorilla LLM, or something along those lines?

Eval results

Hi, thanks for your excellent work.

I ran the eval script

python ast_eval_th.py --api_dataset ../../data/api/torchhub_api.jsonl --apibench ../../data/apibench/torchhub_eval.json --llm_responses ../eval-data/responses/torchhub/response_torchhub_Gorilla_FT_0_shot.jsonl

and get the results:

Final Functionality accuracy:  0.7580645161290323
Final hallucination:  0.16129032258064516

I find these results are inconsistent with the results reported in the paper.


I would like to ask where I got it wrong.

Thanks.

Gorilla Self-Hosted

Hi,

Is it also possible to self-host Gorilla with an API that is compatible with the OpenAI chat completion API?
So, essentially the same as depicted in the Colab?

De-duplicate APIBench eval data (?)

The evaluation data for APIBench is duplicated between data/apibench/*_eval.json and eval/eval-data/questions/. I think the only difference is formatting. Maybe we should just keep eval/eval-data/responses and have data/apibench hold only the data used to train the model.

Initially we made two copies with the following rationale:
  • apibench should have all the data self-contained, since the community is using it to train/benchmark their LLMs.
  • eval/ would have the eval data in a format that is easy to eyeball, to understand what is going on.

Maybe this is one of those few cases where it might be ok to have the same data twice in the repository in different formats?

Starting this issue in case anyone has comments on this.

[bug] The provided response file's test results are not consistent with the paper

Describe the bug

We used the file /eval/eval-data/responses/torchhub/response_torchhub_Gorilla_FT_0_shot.jsonl and the script /eval/eval-scripts/ast_eval_th.py to calculate the metrics. The final calculated result is Final Functionality accuracy: 75.80, Final hallucination: 16.12, which is a big difference from the zero-shot TorchHub numbers published in Table 1 of the paper (Functionality accuracy: 59.13, hallucination: 6.98).

To Reproduce
Steps to reproduce the behavior:

  1. Use the file /eval/eval-data/responses/torchhub/response_torchhub_Gorilla_FT_0_shot.jsonl and then run /eval/eval-scripts/ast_eval_th.py to calculate the metrics.

Screenshots
None

Proposed Solution
None

Additional context
We would like to know why there is a large discrepancy with the original published results, whether it is because an update was made or we compared the wrong table.

[feature] Run Gorilla locally without GPUs 🦍

Today, Gorilla endpoints run on UC Berkeley-hosted servers 🐻. When you try our Colab, our chat completion API, or the CLI tool, it hits our GPUs for inference. A popular ask among our users is to run Gorilla locally on MacBooks/Linux/WSL.

Describe the solution you'd like:
Have the model(s) running locally on MPS/CPU/GPU and listening on a port. All the current Gorilla endpoints can then just hit localhost to get the response to any given prompt.

Additional context:
Here is an application that would immediately use it: https://github.com/gorilla-llm/gorilla-cli
Given we have LLaMA models, these should be plug-and-play: ggerganov/llama.cpp and karpathy/llama2.c
Also relevant: https://huggingface.co/TheBloke/gorilla-7B-GPTQ

Update 1: If you happen to have an RTX, V100, A100, or H100, you can use Gorilla today without any latency hit. The goal of this enhancement is to help those who may not have access to the latest and greatest GPUs.
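As a sketch of what the llama.cpp route could look like via llama-cpp-python (the GGUF file name is hypothetical; use whichever quantized Gorilla conversion you have on hand):

from llama_cpp import Llama

# Runs on CPU (or Metal on Macs) with a quantized model file.
llm = Llama(model_path="gorilla-7b-hf-v1.Q4_K_M.gguf", n_ctx=2048)  # hypothetical file
out = llm("I want to detect objects in an image.", max_tokens=256)
print(out["choices"][0]["text"])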

License?

Hello, thanks for making your work available! Have you chosen a license yet?

Train with MPT 8k

Is the feature request related to a problem?

Would it be expensive to train with MPT 8k? Can you provide an MPT 8k model?

Describe the solution you'd like
When I run Gorilla, I want to see an 8k context window.

I'd prefer to keep Apache 2.0 licensing.

Additional context

https://huggingface.co/mosaicml/mpt-7b-8k

Leveraging Llama 2

I don't see any existing discussion about leveraging Meta's new Llama 2 model. Curious if you have any plans in the works for using this new base model in Gorilla.

RuntimeError: Internal: src/sentencepiece_processor.cc(1101) [model_proto->ParseFromArray(serialized.data(), serialized.size())]

When applying these deltas to the base weights, I get the following error:

$ python apply_delta.py --base-model-path ../../llama-7b-hf/ --target-model-path ../../gorilla-7b-hf-v0/ --delta-path ../../gorilla-7b-hf-delta-v0/
Loading the delta weights from ../../gorilla-7b-hf-delta-v0/
Traceback (most recent call last):
  File "/home/paperspace/projects/gorilla/gorilla/inference/apply_delta.py", line 167, in <module>
    apply_delta(args.base_model_path, args.target_model_path, args.delta_path)
  File "/home/paperspace/projects/gorilla/gorilla/inference/apply_delta.py", line 129, in apply_delta
    delta_tokenizer = AutoTokenizer.from_pretrained(delta_path, use_fast=False)
  File "/home/paperspace/.local/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 702, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/home/paperspace/.local/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1811, in from_pretrained
    return cls._from_pretrained(
  File "/home/paperspace/.local/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1965, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/home/paperspace/.local/lib/python3.9/site-packages/transformers/models/llama/tokenization_llama.py", line 96, in __init__
    self.sp_model.Load(vocab_file)
  File "/home/paperspace/.local/lib/python3.9/site-packages/sentencepiece/__init__.py", line 905, in Load
    return self.LoadFromFile(model_file)
  File "/home/paperspace/.local/lib/python3.9/site-packages/sentencepiece/__init__.py", line 310, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
RuntimeError: Internal: src/sentencepiece_processor.cc(1101) [model_proto->ParseFromArray(serialized.data(), serialized.size())] 

Specs:

$ nvidia-smi
Thu Jun  1 17:50:22 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.105.01   Driver Version: 515.105.01   CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro M4000        Off  | 00000000:00:05.0  On |                  N/A |
| 46%   32C    P8    16W / 120W |    189MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1532      G   /usr/lib/xorg/Xorg                121MiB |
|    0   N/A  N/A      2011      G   /usr/bin/gnome-shell               59MiB |
|    0   N/A  N/A      2571      G   ...bexec/gnome-initial-setup        2MiB |
+-----------------------------------------------------------------------------+
$ LC_ALL=C lspci -v | grep -EA10 "3D|VGA" | grep 'prefetchable' 
	Memory at f4000000 (32-bit, prefetchable) [size=8M]
	Memory at f3000000 (32-bit, non-prefetchable) [size=16M]
	Memory at e0000000 (64-bit, prefetchable) [size=256M]
	Memory at f0000000 (64-bit, prefetchable) [size=32M]
$ free -h
              total        used        free      shared  buff/cache   available
Mem:           29Gi       1.2Gi       5.6Gi        13Mi        22Gi        27Gi
Swap:            0B          0B          0B
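This error usually surfaces when tokenizer.model is truncated or is still a git-lfs pointer stub rather than the real binary (an assumption worth checking before anything else). A quick sanity check:

import os
import sentencepiece as spm

path = "../../gorilla-7b-hf-delta-v0/tokenizer.model"
print("size on disk:", os.path.getsize(path))  # an lfs pointer stub is only ~130 bytes

sp = spm.SentencePieceProcessor()
sp.Load(path)  # raises the same ParseFromArray error if the file is bad
print("vocab size:", sp.GetPieceSize())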

load-8bit flag doesn't work

Describe the issue
When I use the --load-8bit flag, it returns a load_compress_model that's not imported anywhere (and for that reason, I guess, it's failing?).

Any ideas on how to go about this issue? I've searched for this object in the code itself and in Hugging Face's API but couldn't find it, so I'm kind of clueless about what to do.

I'm running this on a single-GPU machine. It's an old T420 with Arch Linux.

Thanks!
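A possible workaround until the missing import is fixed (a sketch, assuming bitsandbytes is installed and a CUDA-capable GPU is available): skip the CLI flag and let transformers quantize the weights at load time.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/gorilla-7b-hf-v0"  # placeholder for your local weights
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    load_in_8bit=True,  # requires bitsandbytes
)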

[bug] Testing Gorilla: <Issue>

Exception: Error communicating with OpenAI: HTTPConnectionPool(host='34.132.127.197', port=8000): Max retries exceeded with url: /v1/chat/completions (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7bba974da140>: Failed to establish a new connection: [Errno 111] Connection refused'))
Failed model: gorilla-7b-tf-v0, for prompt: I want to build a robot that can detecting objects in an image

The returned results show garbled content?

The running command used is:
python3 serve/gorilla_cli.py --model-path model/gorilla-7b-th-v0/

But the returned results show garbled content.

How did this problem arise and how should it be resolved?

[bug] Hosted Gorilla: <Issue>

Exception: Invalid response object from API: '{"object":"error","message":"NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE.\n\n(CUDA error: uncorrectable ECC error encountered\nCUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1.\nCompile with TORCH_USE_CUDA_DSA to enable device-side assertions.\n)","code":50001}' (HTTP response code was 400)
Failed model: gorilla-7b-hf-v1, for prompt: I would like to translate 'I feel very good today.' from English to Chinese

[bug] Hosted Gorilla: <Issue>

Exception: Invalid response object from API: '{"object":"error","message":"","code":50001}' (HTTP response code was 400)
Failed model: gorilla-7b-hf-v1, for prompt: I would like to translate 'I feel very good today.' from English to Chinese

How to run this project?

Describe the issue

I saw the demo in the video, which seems to run on the command line and obtain API calls through dialogue. But I didn't find where to run it to get such results. Do I need to train first, or do I need to run a specific Python file? Please advise.

GPT-4 cutoff date is September 2021 - how did this impact evals?

Any new API info would not be in GPT-4's training data.

How much impact do you think this has on the relative performance between GPT-4 and Gorilla?

Did you do any eval on APIs that existed prior to 09/21 versus those introduced after?

I reviewed the paper but could not find any discussion of this. https://arxiv.org/abs/2305.15334

To be clear, I am not saying this invalidates the ideas, which I think were a fantastic contribution to open-source LLMs, but rather that it would be good to understand the precise reason for the superior performance.
