llm-evaluation-s-always-fatiguing / leaf-playground

A framework for building scenario simulation projects in which human and LLM-based agents can participate, with a user-friendly web UI to visualize simulations and support for automatic evaluation at the agent-action level.

License: MIT License

Python 99.92% Dockerfile 0.08%

llm-evaluation agent-based-simulation automation evaluations agent agents chatgpt

leaf-playground's Issues

Refinement Needed in the Location Definition of the action_exec_timeout Field

I noticed that the action timeout is currently determined by action_exec_time in SceneAgent. However, a single agent may have several actions with different timeout requirements, and one agent-level value cannot accommodate them all. To address this, I propose moving action_exec_time into ActionDefinition, so that each action can declare its own timeout.

Below is my pull request regarding this solution. If you find it suitable, please consider merging it.

#40
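
To illustrate the proposal, here is a minimal sketch of a per-action timeout. It is not the project's actual code: the `ActionDefinition` dataclass below is a simplified stand-in, and the `exec_timeout` field, the `run_action` helper, and the `DEFAULT_TIMEOUT` fallback are hypothetical names chosen for this example. The only real API relied on is `asyncio.wait_for` from the standard library.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Optional

# Hypothetical fallback used when an action does not declare its own timeout.
DEFAULT_TIMEOUT = 30.0


@dataclass
class ActionDefinition:
    """Simplified stand-in for the project's ActionDefinition (assumption)."""
    name: str
    exec_timeout: Optional[float] = None  # seconds; hypothetical field name


async def run_action(
    definition: ActionDefinition,
    action: Callable[[], Awaitable[Any]],
    default_timeout: float = DEFAULT_TIMEOUT,
) -> Any:
    """Run one action, bounded by the timeout declared on its definition."""
    timeout = (
        definition.exec_timeout
        if definition.exec_timeout is not None
        else default_timeout
    )
    # asyncio.wait_for cancels the awaitable and raises asyncio.TimeoutError
    # if it does not finish within `timeout` seconds.
    return await asyncio.wait_for(action(), timeout=timeout)
```

With this shape, a quick action such as a vote and a slow action such as a long generation can sit on the same agent with different limits, and the agent-level value only serves as a default.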

[Roadmap] v1.0.0

Overview

This issue outlines the roadmap to v1.0.0.

The core objective of this project is to deliver a carefully designed and sufficiently flexible framework, together with a set of tools, that lets developers implement, with minimal code, simulation scenarios in which multiple LLM agents interact or compete to fulfill specific needs or tasks. The project also ships diverse pre-built simulation scenarios, so that developers can directly test how their LLM agents perform in each context and compare them with other agents implemented by themselves or by the community.

By the time of the v1.0.0 release, this project will encompass the following features:

A highly abstract core framework with standardized protocols for creating scene projects.
A web service stable enough to run multiple scenario simulation tasks concurrently.
As many scene projects as possible.
Implementations of as many popular LLM reasoning methods as possible.
Support for as many popular LLM backends as possible.
Support for as many popular prompting frameworks as possible.

Core Framework Implementation
Web Service Implementation
Scene Projects Development
LLM Reasoning Methods Implementations
LLM Backends Supporting
Prompting Frameworks Supporting

Core framework implementation

Implement a carefully designed, highly abstract core framework that defines all the elements necessary for creating a scene project and provides standardized protocols that accurately identify every component of a scene project according to the specified configuration.

(todo list here)

Web service implementation

Implement a stable, high-concurrency web service that offers a range of APIs that facilitate seamless interaction with leaf-playground-webui. It will operate in a containerized manner, concurrently executing multiple scenario simulation tasks.

(todo list here)

Scene projects development

Develop a multitude of scene projects that combine entertainment value and application value to meet various evaluation needs of community users. The results of simulation tasks generated by each scene project should effectively quantify the specific application skills and general abilities of LLM agents.

(todo list here)

LLM reasoning methods implementations

See #9 for more details.

LLM Backends supporting

Support a selection of mainstream LLM backends, and define communication protocols when necessary.

(todo list here)

Prompting frameworks supporting

See #9 for more details.

[Feature] Support popular prompting frameworks and initial implementation of popular reasoning strategies

There is a recently published paper, "A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future".

In addition to supporting mainstream LLM backends, the project should also support popular prompting frameworks such as LangChain, Semantic Kernel, Prompt Flow, textai, LlamaIndex, etc. The basic requirement is to offer straightforward wrappers for these frameworks so that developers can integrate them into our project seamlessly. Beyond that, commonly used functionality from these frameworks should be combined and encapsulated to reduce duplicated work for developers.
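
One possible shape for such a wrapper is a small common interface that each framework adapter implements. The sketch below is only an assumption about how this could look: `PromptRunner`, `CallableRunner`, `render`, and `generate` are hypothetical names, and no call into any real prompting framework is shown. A framework-specific adapter would supply its own `render` and `generate` callables.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class PromptRunner(Protocol):
    """Hypothetical common interface every framework wrapper would expose."""

    def run(self, template: str, **variables: str) -> str:
        ...


@dataclass
class CallableRunner:
    """Adapter built from two callables supplied by a specific framework.

    render:   turns a template plus variables into the final prompt text.
    generate: sends the prompt to an LLM and returns its reply.
    """
    render: Callable[..., str]
    generate: Callable[[str], str]

    def run(self, template: str, **variables: str) -> str:
        prompt = self.render(template, **variables)
        return self.generate(prompt)


def plain_render(template: str, **variables: str) -> str:
    # Stand-in renderer that uses Python's built-in str.format.
    return template.format(**variables)
```

Scene code would then depend only on `PromptRunner`, so swapping one framework for another means swapping the two callables rather than touching the agents.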

At the same time, the project should implement a broad range of useful LLM reasoning strategies and encapsulate them as functional modules. Ideally, each strategy should be packaged as an individual tool, so that an LLM can autonomously select and apply the appropriate strategy as needed.

(todo list here)
