yival / yival
Your Automatic Prompt Engineering Assistant for GenAI Applications
Home Page: https://yival.io/
License: Apache License 2.0
Feature request: include time-based metrics around model training and auto-tuning tasks in the final report, to help optimize environments for larger datasets.
Describe the bug
TypeError: OpenAIPromptBasedVariationGeneratorConfig.__init__() got an unexpected keyword argument 'model_name'
To Reproduce
When I execute `yival run demo/configs/animal_story.yml` I get this error.
Expected behavior
Successful launch of demo
Version and Logs
1.1
In the current user flow, if we only use dataset generation, combination, a custom function, and evaluation, it is fine to treat all data from dataset generation as test data and as input for evaluation (e.g. the headline example).
However, once we take improvement into consideration, we cannot use 100% of the generated data for both evaluation and improvement. That would be like using the same data as both training data and testing data.
The suggested modification is to include a train/test split in dataset generation. If no split is specified, all data is treated as test data; otherwise, follow the config specification.
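The proposed default behaviour could be sketched roughly as follows (a hypothetical helper, not YiVal's actual API; the function name and parameters are assumptions):

```python
import random

def split_generated_dataset(examples, test_ratio=1.0, seed=42):
    # Hypothetical sketch of the proposed train/test split in dataset
    # generation; test_ratio defaults to 1.0, matching the suggested
    # behaviour that all data is test data when no split is configured.
    rng = random.Random(seed)          # seeded so splits are reproducible
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    test, train = shuffled[:n_test], shuffled[n_test:]
    return train, test
```

With `test_ratio=1.0` the train partition is empty, preserving today's behaviour; a config-specified ratio would carve out data reserved for improvement.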
Explore whether we can leverage this to clean up model-generated data.
Is your feature request related to a problem? Please describe.
If an experiment stops due to some reason (network issue etc which causes yival to terminate), we lose all the experiment results, and have to run it again. This would be costly when we are running gpt-4 based evaluation for many data points.
Describe the solution you'd like
Save the experiment state and results on disk when running experiment, and add the ability to resume the experiment to yival.
Describe alternatives you've considered
It might help if we keep a cache of LLM calls. But I think a cache is not always favorable as sometimes we want LLM to generate new output on every request.
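The requested checkpoint/resume behaviour could look roughly like this (a minimal sketch, not YiVal's actual implementation; the function name, state format, and file path are assumptions):

```python
import json
import os

def run_with_checkpoint(data_points, evaluate, state_path="experiment_state.json"):
    # Hypothetical sketch: persist per-data-point results after every
    # evaluation so an interrupted experiment can resume instead of
    # re-running expensive (e.g. GPT-4 based) evaluations.
    results = {}
    if os.path.exists(state_path):
        with open(state_path) as f:
            results = json.load(f)     # resume from the previous run
    for i, point in enumerate(data_points):
        key = str(i)
        if key in results:             # already evaluated in a prior run
            continue
        results[key] = evaluate(point)
        with open(state_path, "w") as f:
            json.dump(results, f)      # flush state after every point
    return [results[str(i)] for i in range(len(data_points))]
```

Unlike an LLM-call cache, this keys on the experiment's own data points, so re-running a finished experiment with fresh prompts would still produce new outputs.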
Hello Contributors and Community,
I recently found that there are two very interesting projects, DevOpsGPT and YiVal, both of which are based on AI and specifically large language models, but seemingly aiming at two different aspects in the AI-ML deployment process. I wanted to open a discussion to understand the unique competitive advantages and features that YiVal holds in comparison to DevOpsGPT.
At the outset, let me provide a brief understanding of the two projects:
DevOpsGPT: An AI-Driven Software Development Automation Solution that combines Large Language Models with DevOps tools to convert natural language requirements into working software, thereby enhancing development efficiency and reducing communication costs. It generates code, performs validation, and can analyze existing project information and tasks.
YiVal: A GenAI-Ops framework aimed at iterative tuning of Generative AI model metadata, params, prompts, and retrieval configurations, tuned with test dataset generation, evaluation algorithms, and improvement strategies. It streamlines prompt development, supports multimedia and multimodel input, and offers automated prompt generation and prompt-related artifact configuration.
Looking at both of these, it seems they provide unique features to cater to different needs in the AI development and deployment pipeline. However, I'm curious to further understand the unique selling points and specific competitive advantages of YiVal.
Here are a few questions that might be worth discussing:
DevOpsGPT seems to convert natural language requirements into working software while YiVal seems focused on fine-tuning Generative AI with test dataset generation and improvement strategies. In what ways does YiVal outperform DevOpsGPT in facilitating a more robust and efficient machine learning model iteration and training process?
One of the highlighted features of YiVal is its focus on Human(RLHF) and algorithm-based improvers along with the inclusion of a detailed web view. Can you provide a bit more insight into how these features are leveraged in YiVal and how they compare to DevOpsGPT's project analysis and code generation features?
DevOpsGPT offers a feature to analyze existing projects and tasks, whereas YiVal emphasizes streamlining prompt development and multimedia/multimodel input. How does YiVal handle integration with existing models and datasets? Is there any scope for reverse-engineering or retraining established models with YiVal?
In terms of infrastructure, how does YiVal compare to DevOpsGPT? Do they need similar resources for deployment and operation, or does one offer more efficiency?
Lastly, how is the user experience on YiVal compared to DevOpsGPT? I see YiVal boasts a "non-code" experience for building Gen-AI applications, but how does this hold up against DevOpsGPT's efficient and understandable automated development process?
I'd appreciate any insights or thoughts on these points. Looking forward to stimulating discussions!
Describe the bug
To Reproduce
Steps to reproduce the behavior:
When running this colab: https://colab.research.google.com/drive/1QgRQmFmC_L07Ler4vbq_vcCNm_OHmJL_#scrollTo=KnmaSTEc13Rg
!poetry run yival run /gairdrails_leetcode.yml --async_eval=True
See error:
"Rate limit exceeded, sleeping......"
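Rather than sleeping a fixed interval on every rate-limit error, one common mitigation is exponential backoff with jitter. A minimal sketch (not YiVal's actual retry logic; `RuntimeError` stands in for whatever rate-limit exception the client raises):

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    # Hypothetical sketch: retry `call` on rate-limit errors, doubling the
    # delay each attempt and adding jitter so concurrent async evaluations
    # don't all retry at the same moment.
    for attempt in range(max_retries):
        try:
            return call()
        except RuntimeError:               # stand-in for a RateLimitError
            if attempt == max_retries - 1:
                raise                      # give up after the last attempt
            delay = base_delay * (2 ** attempt) * (0.5 + random.random() / 2)
            time.sleep(delay)
```

This matters most with `async_eval=True`, where many concurrent requests can hit the rate limit at once.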
I hope human rating can be embedded in multiple iterations, so that in each iteration I can optimize the prompt toward my goal.
def debug():
    input = """
response_content:Step 1: Evaluate if the headline clearly communicates what the startup does or what problem it solves. The headline "Unlock the Power of Blockchain: The Ultimate Solution for Enhanced Security" does communicate that the startup is involved in blockchain technology and aims to provide enhanced security solutions.
Step 2: Determine if it is immediately clear to anyone who reads the headline what the startup's purpose is. The headline does make it clear that the startup's purpose is to provide security solutions using blockchain technology.
Step 3: Assess if there is any lack of clarity that can lead to confusion and may discourage potential users or investors. The headline is straightforward and does not seem to have any elements that could cause confusion.
Conclusion: The headline meets the criterion very well.
E
"""
    choice = extract_choice_from_response(input, ["A", "B", "C", "D", "E"])
    print(f"choice is now {choice}")

if __name__ == "__main__":
    debug()
result:
choice is now C
The function extract_choice_from_response is too fragile: it can match a choice letter that merely appears inside the response text, so the extracted score may be incorrect. Here it returns "C" even though the intended answer is "E".
I'll fix this problem next week after I read all the code.
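One possible fix is to only accept a choice letter that stands alone on its own line, scanning from the end of the response so the final verdict wins. A minimal sketch (the function name mirrors the existing one, but this implementation and its pattern are assumptions):

```python
import re

def extract_choice_from_response(response, choices):
    # Hypothetical, more defensive sketch: match a choice letter only when
    # it stands alone on a line (optionally prefixed by "Answer:"), and
    # scan lines from the end so the final verdict takes precedence.
    pattern = re.compile(r"^\s*(?:Answer\s*:\s*)?([A-Z])\s*$")
    for line in reversed(response.strip().splitlines()):
        match = pattern.match(line)
        if match and match.group(1) in choices:
            return match.group(1)
    return None
```

On the sample response above, this version returns "E" instead of the spurious "C".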
Is your feature request related to a problem? Please describe.
I've been using YiVal for a couple of days now. Are you planning on building a hosted product?
If yes, I'd love to help out (I'm the maintainer of LiteLLM, https://github.com/BerriAI/litellm).
Hello YiVal Team,
Firstly, congratulations on your fantastic work on the YiVal project. It is clear that this unique GenAI-Ops framework has been carefully designed with quality and utility in mind.
I have been utilizing your framework and appreciate the ability it offers to iteratively tune Generative AI model metadata, parameters, prompts, and retrieval configurations. It's impressive that users can not only select their test dataset generation and evaluation algorithms but also choose the improvement strategies. This flexibility truly differentiates your work.
I've also been following the MetaGPT project, a multi-agent framework that empowers a GPT to operate within a software company, encouraging collaboration on more complex tasks. MetaGPT is particularly notable for its approach to orchestrate GPTs to carry out distinct roles within a software entity, transforming a one-line requirement into an extensive set of user stories, competitive analyses, requirements, APIs, and even data structures. It presents these outputs in its unique way of having GPTs fulfill roles equivalent to product managers, project managers, architects, and engineers. This recapitulation of a software company's operations and processes within the framework is interesting.
Given your familiarity with both YiVal and MetaGPT, I am curious to understand the prominent distinguishing features, key strengths, niche audience, or potential use cases, that YiVal offers relative to MetaGPT. What are the fundamental competitive advantages of YiVal over MetaGPT?
Looking forward to the clarifications and insights you can provide. Thank you for taking the time to address my query.
Best Regards,
Is your feature request related to a problem? Please describe.
I can't find how to use YiVal with an Azure OpenAI LLM in the docs or on GitHub.
Describe the solution you'd like
A clear example of using an Azure OpenAI LLM.
Describe alternatives you've considered
I've seen this page : (
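For reference, a stdlib-only sketch of what an Azure OpenAI chat-completions request looks like. The REST route follows Azure's documented shape, but the helper name and environment-variable names are assumptions, and no YiVal-specific wiring is shown since the docs don't cover it yet:

```python
import os

def azure_chat_completion_request(deployment, messages, api_version="2024-02-01"):
    # Hypothetical helper: build the URL, headers, and JSON body for an
    # Azure OpenAI chat-completions call. Azure routes requests by
    # deployment name and authenticates with an "api-key" header, unlike
    # the public OpenAI API's model name and Bearer token.
    endpoint = os.environ["AZURE_OPENAI_ENDPOINT"].rstrip("/")
    url = (f"{endpoint}/openai/deployments/{deployment}"
           f"/chat/completions?api-version={api_version}")
    headers = {
        "api-key": os.environ["AZURE_OPENAI_API_KEY"],
        "Content-Type": "application/json",
    }
    body = {"messages": messages}
    return url, headers, body
```

A documented example would presumably map these endpoint/deployment/api-version settings into YiVal's model configuration.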