
lamini's People

Contributors

arman-hk, edamamez, gdiamos, greg1232, lbux, ninazwei90, omonida, powerjohnnyli, sharonzhou, thedch, tulika612


lamini's Issues

Pydantic dependency issue

Description

I am integrating lamini into my Python backend, but I can't, because of lamini's Pydantic requirement:

lamini 1.0.3 depends on pydantic==1.10.*

All my other dependencies require pydantic>=2.x.x. Can you please upgrade, or let me know how to work around this?
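The conflict is structural: no single version of pydantic satisfies both `==1.10.*` and `>=2`. A stdlib-only sketch of that check (`satisfies` is a hypothetical helper for illustration, not part of pip or lamini):

```python
# Hypothetical helper illustrating the conflict reported above:
# lamini 1.0.3 pins pydantic==1.10.*, the rest of the stack needs
# pydantic>=2, and no version can satisfy both specifiers at once.
def satisfies(version: str, spec: str) -> bool:
    major = int(version.split(".")[0])
    if spec == "==1.10.*":
        return version.startswith("1.10.")
    if spec == ">=2":
        return major >= 2
    raise ValueError(f"unsupported spec: {spec}")

print(satisfies("1.10.13", "==1.10.*") and satisfies("1.10.13", ">=2"))  # → False
print(satisfies("2.5.0", "==1.10.*") and satisfies("2.5.0", ">=2"))      # → False
```

Until the pin is relaxed upstream, the usual workarounds are isolating lamini in its own virtual environment or keeping the backend on pydantic 1.10.x.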

ModelNameError: Not Found in the "Lamini: Finetuning for Free.ipynb" Colab

Hi,

I get this error:

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/llama/program/util/run_ai.py in powerml_send_query_to_url(params, route)
    133         )
--> 134         response.raise_for_status()
    135     except requests.exceptions.Timeout:

/usr/local/lib/python3.10/dist-packages/requests/models.py in raise_for_status(self)
   1020         if http_error_msg:
-> 1021             raise HTTPError(http_error_msg, response=self)
   1022

HTTPError: 404 Client Error: Not Found for url: https://api.powerml.co/v1/llama/data

During handling of the above exception, another exception occurred:

ModelNameError                            Traceback (most recent call last)
<ipython-input-4-de718df47cc0> in <cell line: 3>()
      1 # Train the model
      2 start=time.time()
----> 3 finetune_model.train(enable_peft=True)
      4 print(f"Time taken: {time.time()-start} seconds")

/usr/local/lib/python3.10/dist-packages/llama/runners/question_answer_runner.py in train(self, verbose, finetune_args, enable_peft, peft_args, limit)
    118         else:
    119             qa_pairs = self.question_answer
--> 120         self.llm.save_data(qa_pairs)
    121
    122         final_status = self.llm.train(task="question_answer", verbose=verbose, finetune_args=finetune_args, enable_peft=enable_peft, peft_args=peft_args)

/usr/local/lib/python3.10/dist-packages/llama/program/builder.py in save_data(self, data)
    365         self.program.examples = []
    366         self.program.add_data(examples=data)
--> 367         results = gen_submit_data(self.program, self.id)
    368         return results
    369

/usr/local/lib/python3.10/dist-packages/llama/program/util/api_actions.py in gen_submit_data(program, id)
    138         "data": program["examples"],
    139     }
--> 140     response = query_submit_data(params)
    141     response.raise_for_status()
    142     return response.json()

/usr/local/lib/python3.10/dist-packages/llama/program/util/run_ai.py in query_submit_data(params)
     83
     84 def query_submit_data(params):
---> 85     resp = powerml_send_query_to_url(params, "/v1/llama/data")
     86     return resp
     87

/usr/local/lib/python3.10/dist-packages/llama/program/util/run_ai.py in powerml_send_query_to_url(params, route)
    137     except requests.exceptions.HTTPError as e:
    138         if response.status_code == 404:
--> 139             raise llama.error.ModelNameError(
    140                 response.json().get("detail", "ModelNameError")
    141             )

ModelNameError: Not Found

at this step of the notebook:

# Train the model
start=time.time()
finetune_model.train(enable_peft=True)
print(f"Time taken: {time.time()-start} seconds")

I have tried other model names, but with the same result. Is it an issue on my end, or possibly with the API?
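For what it's worth, the traceback shows the client converting any HTTP 404 from the data endpoint into a ModelNameError, so the message need not mean the model name itself is wrong. A minimal sketch of that mapping (the class and function are stand-ins mirroring the traceback, not the actual lamini client code):

```python
# Sketch of the 404 -> ModelNameError translation visible in the
# traceback above (run_ai.py, powerml_send_query_to_url). Stand-in
# names; not the library's own implementation.
class ModelNameError(Exception):
    pass

def raise_for_api_status(status_code: int, detail: str = "Not Found") -> None:
    if status_code == 404:
        # A missing route or a retired endpoint also produces a 404,
        # which is why it can masquerade as a bad model name.
        raise ModelNameError(detail)
    if status_code >= 400:
        raise RuntimeError(f"HTTP {status_code}")
```

So a 404 from https://api.powerml.co/v1/llama/data could equally indicate a server-side or deprecated endpoint rather than a bad model name.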

Colab example "Question-Answer LLM Finetuning" error

Problem 1:

!pip install --upgrade --force-reinstall --ignore-installed lamini

When I run this command, I get the error "AttributeError: module 'numpy.linalg._umath_linalg' has no attribute '_ilp64'".

I solved this problem myself by pinning the version instead:

!pip install --upgrade lamini==0.0.19

Problem 2:

It's my first time using lamini, and I still have some available credits, yet I get the error shown in the screenshots (omitted here).

Can you tell me how to resolve the problem?

Response Data Type

I think I found a bug in generate_data.py, line 37:

 add_response_data = int(arguments["response_data"])

This implementation attempts to convert the value to an integer, but the response_data flag is intended to be a boolean.
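If the flag arrives as a string, `int("True")` raises ValueError, and `int(True)` silently degrades the flag to 0/1. A hedged sketch of a boolean-safe parse (`parse_bool` is a hypothetical replacement for the quoted line, not code from the repo):

```python
# Hypothetical fix sketch for the generate_data.py line quoted above:
# treat response_data as a boolean rather than forcing int() on it.
def parse_bool(value) -> bool:
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in ("1", "true", "yes", "on")

arguments = {"response_data": "True"}  # example input; shape assumed
add_response_data = parse_bool(arguments["response_data"])
print(add_response_data)  # → True
```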

Server Issue

Is there any way to check whether the Lamini server is down or up?

I submitted this job: https://app.lamini.ai/train/6143. It completed successfully, but when I refresh the browser, the page keeps loading and never renders.

How to fine-tune the model?

I fine-tuned the model on instructions using PEFT, but the performance is not good. Is there another way to fine-tune?
Without instructions, small models work well for me, but I want to do instruction tuning.

status code 400

When I run the following code:

chatgpt = BasicModelRunner("chat-gpt")
print(chatgpt("Tell me how to train my dog to sit"))

Output

(screenshot of the 400 response omitted)

Chinese support

Thank you for sharing!
Does it support Chinese data generation?

Error in loading the API key from config file

Traceback (most recent call last):
  File "C:\Users\tytun\desktop\Lamini_test.py", line 9, in <module>
    llm(test, output_type=Test)
  File "C:\Users\tytun\AppData\Local\Programs\Python\Python310\lib\site-packages\llama\program\builder.py", line 48, in __call__
    return run(value)
  File "C:\Users\tytun\AppData\Local\Programs\Python\Python310\lib\site-packages\llama\program\value.py", line 185, in gen_value
    value._compute_value()
  File "C:\Users\tytun\AppData\Local\Programs\Python\Python310\lib\site-packages\llama\program\value.py", line 72, in _compute_value
    response = query_run_program(params)
  File "C:\Users\tytun\AppData\Local\Programs\Python\Python310\lib\site-packages\llama\program\util\run_ai.py", line 10, in query_run_program
    key, url = get_url_and_key()
  File "C:\Users\tytun\AppData\Local\Programs\Python\Python310\lib\site-packages\llama\program\util\run_ai.py", line 111, in get_url_and_key
    key = cfg["production.key"]
  File "C:\Users\tytun\AppData\Local\Programs\Python\Python310\lib\site-packages\config\configuration_set.py", line 110, in __getitem__
    return self._from_configs("__getitem__", item)
  File "C:\Users\tytun\AppData\Local\Programs\Python\Python310\lib\site-packages\config\configuration_set.py", line 67, in _from_configs
    raise last_err
  File "C:\Users\tytun\AppData\Local\Programs\Python\Python310\lib\site-packages\config\configuration_set.py", line 61, in _from_configs
    values.append(getattr(config, attr)(*args, **kwargs))
  File "C:\Users\tytun\AppData\Local\Programs\Python\Python310\lib\site-packages\config\configuration.py", line 155, in __getitem__
    raise KeyError(item)
KeyError: 'production.key'

Lamini_test.py contains only the code from the "Basic test" section of https://lamini-ai.github.io/#setup-your-keys.
The config file is in desktop/.powerml/.
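The KeyError suggests the config library resolves the dotted key "production.key" as a nested lookup, so the file must define a production section containing a key entry. A stdlib-only sketch of that resolution (`get_dotted` is hypothetical; the actual config package may differ in details):

```python
# Sketch of how a dotted lookup like cfg["production.key"] maps onto a
# nested config structure. If the file lacks this nesting, the lookup
# fails with KeyError: 'production.key', as in the traceback above.
def get_dotted(cfg: dict, dotted: str):
    current = cfg
    for part in dotted.split("."):
        current = current[part]  # KeyError here if a level is missing
    return current

config = {"production": {"key": "<YOUR_API_KEY>"}}  # assumed shape
print(get_dotted(config, "production.key"))  # → <YOUR_API_KEY>
```

So the first thing to check is that the config file actually nests the key under a production section rather than defining a flat "production.key" entry.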

Pricing

What's the pricing model for inference using your service? It isn't advertised anywhere.

User permission?

When I run the code below:

finetune_model = QuestionAnswerModel(model_name="EleutherAI/pythia-410m-deduped-v0")
answer = finetune_model.get_answer("What is Lamini?")
print(answer)

I got this error:

UserError: Currently this user has support for base models: ['hf-internal-testing/tiny-random-gpt2', 'EleutherAI/pythia-70m', 'EleutherAI/pythia-70m-deduped', 'EleutherAI/pythia-70m-v0', 'EleutherAI/pythia-70m-deduped-v0', 'EleutherAI/neox-ckpt-pythia-70m-deduped-v0', 'EleutherAI/neox-ckpt-pythia-70m-v1', 'EleutherAI/neox-ckpt-pythia-70m-deduped-v1', 'EleutherAI/gpt-neo-125m', 'EleutherAI/pythia-160m', 'EleutherAI/pythia-160m-deduped', 'EleutherAI/pythia-160m-deduped-v0', 'EleutherAI/neox-ckpt-pythia-70m', 'EleutherAI/neox-ckpt-pythia-160m', 'EleutherAI/neox-ckpt-pythia-160m-deduped-v1', 'EleutherAI/pythia-410m', 'EleutherAI/pythia-410m-v0', 'EleutherAI/pythia-410m-deduped', 'EleutherAI/pythia-410m-deduped-v0', 'EleutherAI/neox-ckpt-pythia-410m', 'EleutherAI/neox-ckpt-pythia-410m-deduped-v1', 'cerebras/Cerebras-GPT-111M', 'cerebras/Cerebras-GPT-256M', 'meta-llama/Llama-2-7b-hf', 'meta-llama/Llama-2-7b-chat-hf']

Why is EleutherAI/pythia-410m-deduped-v0 reported as unsupported?
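Oddly, a plain membership check against the list printed in the UserError suggests the requested name should pass (list truncated here to the relevant entries):

```python
# Membership check using entries copied from the UserError above.
# The requested model string appears in the supported list, so the
# rejection looks inconsistent with the error's own message.
supported = [
    "EleutherAI/pythia-410m",
    "EleutherAI/pythia-410m-deduped",
    "EleutherAI/pythia-410m-deduped-v0",
    "meta-llama/Llama-2-7b-hf",
]
requested = "EleutherAI/pythia-410m-deduped-v0"
print(requested in supported)  # → True
```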

Colab example training fails with no log at https://app.lamini.ai/train

I tried to follow https://colab.research.google.com/drive/1QMeGzR9FnhNJJFmcHtm9RhFP3vrwIkFn?usp=sharing from the README.

The train step

start=time.time()
finetune_model.train(enable_peft=True)
print(f"Time taken: {time.time()-start} seconds")

always fails with

Training job submitted! Check status of job 3459 here: https://app.lamini.ai/train
Job failed: {'job_id': 3459, 'status': 'FAILED', 'start_time': '2023-09-27T12:52:33.304092', 'model_name': None, 'custom_model_name': None, 'is_public': None}
Time taken: 35.00070023536682 seconds

I tried specifying different model names, but it does not help. What makes it harder is that the log tab on https://app.lamini.ai/train is also empty.

(screenshot of the empty log tab, 2023-09-27, omitted)

How should I proceed from here? Any suggestions are greatly appreciated!

Finetuning job fails on a short custom dataset

I'm trying out fine-tuning with the following code:

import pandas as pd
from llama import QuestionAnswerModel

data = pd.read_json('data/seed_lamini_docs.jsonl', lines=True).to_dict(orient='records')
model = QuestionAnswerModel(model_name="EleutherAI/pythia-410m-deduped-v0")
model.load_question_answer(data)
model.train(verbose=True)

where data/seed_lamini_docs.jsonl is a copy of this file. With that file, the fine-tuning process completes without problems. However, when I change the data to my own dataset, the training job fails without any error message. I also looked at https://app.lamini.ai/train for error logs so I could fix the issue(s) with my dataset, but found none. My dataset has the same format as seed_lamini_docs.jsonl, except it's only ~30 lines long. Is there a minimum length for the fine-tuning dataset?
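If an undocumented minimum dataset size is the culprit, one way to test that hypothesis is to upsample the ~30-row dataset by repetition before loading it (`upsample` is a hypothetical diagnostic helper, not a lamini API):

```python
# Hypothetical diagnostic: repeat a small Q&A dataset up to a target
# size. If training then succeeds, a minimum-size requirement was the
# likely cause of the silent failure described above.
import itertools

def upsample(records: list, minimum: int = 100) -> list:
    if len(records) >= minimum:
        return list(records)
    return list(itertools.islice(itertools.cycle(records), minimum))

tiny = [{"question": f"q{i}", "answer": f"a{i}"} for i in range(30)]
print(len(upsample(tiny)))  # → 100
```

The repeated rows add no information, so this is only a probe for the failure mode, not a substitute for more data.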

Unable to import RetrievalAugmentedRunner

I cannot import RetrievalAugmentedRunner; the DirectoryLoader, LaminiIndex, and QueryEngine classes have the same problem.
As per the official documentation, the correct way to import these classes is directly from the lamini package, yet I also could not find them anywhere in the code in this repo.
I have also raised the same issue here.

I am running Ubuntu 22.04 with Python 3.10.13.

If you need any more details, please do let me know.
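A quick way to confirm which documented names the installed package actually exposes is an attribute scan (`missing_attrs` is a hypothetical diagnostic, demonstrated here against a stdlib module so it runs anywhere):

```python
# Diagnostic sketch: list which expected class names a module fails to
# expose. Swapping "math" for "lamini" (with names such as
# "RetrievalAugmentedRunner") reproduces the import problem above.
import importlib

def missing_attrs(module_name: str, names: list) -> list:
    module = importlib.import_module(module_name)
    return [n for n in names if not hasattr(module, n)]

print(missing_attrs("math", ["sqrt", "definitely_not_here"]))  # → ['definitely_not_here']
```

If lamini turns up in the missing list for all four classes, the installed wheel simply does not ship them, which matches not finding them in this repo.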

Error in Code Guru Walkthrough

Hello,

I am trying to implement your Code Guru walkthrough (https://lamini-ai.github.io/Examples/code_guru/#define-the-llm-interface) and am getting the following error. Could you please take a look?

Input
question = Question(question='LLMEngine.save_data')
function = Function(
name='init',
code='def init(self, builder, name):\n self.builder = builder\n self.name = name\n self.main = Function(program=self, name="main")\n self.functions = {"main": self.main}\n self.examples = []'
)
llm.save_data([function])
answer = llm(input=Question(question=question), output_type=Answer)

print(answer)

Output Error

ValidationError                           Traceback (most recent call last)
/home/ubuntu/lamini/main.ipynb Cell 20 line 7
      2 function = Function(
      3     name='init',
      4     code='def init(self, builder, name):\n self.builder = builder\n self.name = name\n self.main = Function(program=self, name="main")\n self.functions = {"main": self.main}\n self.examples = []'
      5 )
      6 llm.save_data([function])
----> 7 answer = llm(input=Question(question=question), output_type=Answer)
      9 print(answer)

File ~/lamini/lib/python3.10/site-packages/llama/types/type.py:18, in Type.__init__(self, *args, **kwargs)
     16     object.__setattr__(self, "fields_set", unvalidated.fields_set)
     17 else:
---> 18     super().__init__(*args, **kwargs)
     20 self._value = Value(type(self), data=self)

File ~/lamini/lib/python3.10/site-packages/pydantic/main.py:341, in pydantic.main.BaseModel.__init__()

ValidationError: 1 validation error for Question
question
  str type expected (type=type_error.str)
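The error itself points at a double wrap: `question` is already a Question, and `Question(question=question)` hands a Question instance to a field that expects str. A minimal stand-in reproduction (this Question is a dataclass sketch, not lamini's actual type):

```python
# Stand-in reproduction of the ValidationError above: wrapping an
# already-constructed Question in another Question feeds a non-str
# where a str is expected. Passing `question` directly avoids it.
from dataclasses import dataclass

@dataclass
class Question:
    question: str

    def __post_init__(self):
        if not isinstance(self.question, str):
            raise TypeError("str type expected")

q = Question(question="LLMEngine.save_data")  # fine
try:
    Question(question=q)                      # the double wrap
except TypeError as e:
    print(e)  # → str type expected
```

So `answer = llm(input=question, output_type=Answer)` is likely the intended call in the walkthrough.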

French Support

Thanks for letting us know!
Can it be used for several languages?

If not, please share the steps for how I can train it on French data.

Data quality

Has anyone actually looked at the generated data? It looks quite bad. I know it is free and all, but people might actually use this, and I can't see how it will improve any model.

Lamini APIs broke on Jan 26, 2024

Hello,

I was testing a small RAG bot using the lamini APIs, and this morning previously working code stopped working. After I updated to lamini Python lib version 2.0.8, I am getting API error 502:

Bad Gateway for url: https://api.lamini.ai/v1/inference/embedding

Did the endpoint change? I don't see any documentation.
site-packages\lamini\api\rest_requests.py", line 176, in make_web_request
raise APIError(f"API error {description}")
lamini.error.error.APIError: API error 502

With a larger text file, I was getting a model-name error.

Any guidance please?
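A 502 Bad Gateway is typically transient or server-side, so retrying with backoff helps distinguish an outage from a client bug. A generic sketch (`with_retries` is hypothetical, not part of the lamini client):

```python
# Hedged sketch: retry a callable a few times with exponential backoff
# before concluding an endpoint (e.g. /v1/inference/embedding) is down.
import time

def with_retries(fn, attempts: int = 3, base_delay: float = 0.1):
    for attempt in range(attempts):
        try:
            return fn()
        except RuntimeError:
            if attempt == attempts - 1:
                raise  # still failing after all attempts: likely an outage
            time.sleep(base_delay * 2 ** attempt)
```

Wrapping the embedding call this way would tell you whether the 502 was a momentary gateway hiccup or a persistent endpoint change.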

llama.error.error.ModelNameError: Not Found

non_finetuned = BasicModelRunner(model_name="meta-llama/Llama-2-7b-hf", config={"production": {"key": "xxxxxxxxxxx"}})
non_finetuned_output = non_finetuned("Tell me how to train my dog to sit")
print(non_finetuned_output)

raises this exception:

Traceback (most recent call last):
  File "/Users/dingli/PycharmProjects/my-llama-index/venv/lib/python3.10/site-packages/llama/program/util/run_ai.py", line 134, in powerml_send_query_to_url
    response.raise_for_status()
  File "/Users/dingli/PycharmProjects/my-llama-index/venv/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://api.powerml.co/v1/llama/run_program

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/dingli/PycharmProjects/my-llama-index/lamini_fine_tuning.py", line 11, in <module>
    non_finetuned_output = non_finetuned("Tell me how to train my dog to sit")
  File "/Users/dingli/PycharmProjects/my-llama-index/venv/lib/python3.10/site-packages/llama/runners/basic_model_runner.py", line 39, in __call__
    output_objects = self.llm(
  File "/Users/dingli/PycharmProjects/my-llama-index/venv/lib/python3.10/site-packages/llama/program/builder.py", line 77, in __call__
    result = gen_value(value)
  File "/Users/dingli/PycharmProjects/my-llama-index/venv/lib/python3.10/site-packages/llama/program/util/api_actions.py", line 178, in gen_value
    value._compute_value()
  File "/Users/dingli/PycharmProjects/my-llama-index/venv/lib/python3.10/site-packages/llama/program/value.py", line 65, in _compute_value
    response = query_run_program(params)
  File "/Users/dingli/PycharmProjects/my-llama-index/venv/lib/python3.10/site-packages/llama/program/util/run_ai.py", line 11, in query_run_program
    resp = powerml_send_query_to_url(params, "/v1/llama/run_program")
  File "/Users/dingli/PycharmProjects/my-llama-index/venv/lib/python3.10/site-packages/llama/program/util/run_ai.py", line 139, in powerml_send_query_to_url
    raise llama.error.ModelNameError(
llama.error.error.ModelNameError: Not Found

Quick Start Issue

Hello friends, I wanted to let you know that there is an error in your quick tour (https://lamini-ai.github.io/). It appears that

llm = LlamaV2Runner()
llm.train(data=data)

should be

llm = LlamaV2Runner()
llm.data = data
llm.train()
