jointhealliance / bgent
Flexible, scalable and customizable agents to do your bidding.
Home Page: https://bgent.org
We want to demonstrate bgent running on-chain. We have a cloudflare worker version, which has a similar state pattern. However, our agent is currently tied to supabase.
This kind of relies on these:
#53
#4
The final output is an agent running in a canister on the Internet Computer using Azle and bgent. We should be able to converse with the agent, and the agent should be able to perform search / RAG on documents.
Right now goals are half-implemented and not fully functional. This is one of the last pre-release alpha features we need for our first goal directed agents to be... goal directed.
We want to run things as minimally as possible. We should make sure our validation is working and not providing too many LLM options or causing LLM calls when we only have one option.
This is related to #14 -- possibly a duplicate, but the important part here is that our providers, actions and evaluators should not force a composeState unless they need new state; instead they should retrieve any new state they need and add it to the existing state object.
bgent.org is kinda ugly, needs design and mobile responsive cleanup
We have a Discord interactions handler bot, but we want a Discord.js bot example running
Right now we are re-composing state a lot, or making database calls for the same state multiple times in parallel.
We could optimize our performance and exec time by caching state with a system that ensures that we always have the latest cache.
composeState queries actors, events, facts, conversation, etc. It will only be altered by the addition of new messages. Most functions have an optional state input, but some of these are not set up properly and recompose the state anyway.
It's important that on elaborate, and in the space between the user's message being saved and the agent responding, we update the conversation history and any other necessary state locally while passing the state object through.
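The pattern above could look something like this sketch. The names (`State`, `composeState`, `handleMessage`) and shapes are illustrative, not the actual bgent API: functions take an optional state, compose it only when it's missing, and otherwise update the existing object locally as it gets threaded through.

```typescript
interface Message { content: string }
interface State { recentMessages: Message[]; actors: string[] }

// Stand-in for the expensive call that hits the database.
let composeCalls = 0;
async function composeState(): Promise<State> {
  composeCalls++;
  return { recentMessages: [], actors: [] };
}

// Only compose when no state was passed in; otherwise append the new
// message locally and keep passing the same object through.
async function handleMessage(message: Message, state?: State): Promise<State> {
  const s = state ?? (await composeState());
  s.recentMessages = [...s.recentMessages, message];
  return s;
}
```

With this shape, a full request cycle only pays for composeState once, no matter how many handlers touch the state.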
We should abstract our completion and embedding to handle different model providers, not just OpenAI.
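A minimal sketch of what that abstraction could look like. The `ModelProvider` interface name and method signatures are hypothetical, not an existing bgent surface; OpenAI would just be one implementation behind it:

```typescript
// Hypothetical provider abstraction; the runtime depends on this
// interface rather than on the OpenAI client directly.
interface ModelProvider {
  completion(opts: { context: string; temperature?: number }): Promise<string>;
  embedding(input: string): Promise<number[]>;
}

// Trivial stub implementation, e.g. for tests or local runs.
class StubProvider implements ModelProvider {
  async completion(opts: { context: string; temperature?: number }): Promise<string> {
    return `echo: ${opts.context}`;
  }
  async embedding(input: string): Promise<number[]> {
    return new Array(4).fill(input.length);
  }
}

async function run(provider: ModelProvider, context: string): Promise<string> {
  return provider.completion({ context, temperature: 0.7 });
}
```

Swapping in LocalLlama, Claude, or any other backend then means writing one class, not touching the runtime.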
Right now RLS is disabled on some of the tables. This is because we are calling Postgres triggers which do not properly pass authentication down, or are being called from the wrong place.
We need to look at how to fix the RLS rules so that we have strict RLS on everything.
This especially affects account creation.
Right now tests are not cleaning up properly, and leave some junk in the database. We should make sure that this is all cleaned up to maintain reliable testing.
We need some way to arbitrarily allow the user to add state data to their agent.
This can relate to our idea of "statuses" -- we need providers to inject context in at generation time.
This should inject into the state object and then get injected into the main context.
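One possible shape for "providers inject context at generation time". These types are illustrative guesses, not the real bgent definitions: each provider returns a string, the results get folded into the state object, and the state then flows into the main context:

```typescript
// Hypothetical provider contract: given current state, return a
// string to inject into the prompt (e.g. a user "status").
interface Provider {
  get(state: Record<string, string>): Promise<string>;
}

const statusProvider: Provider = {
  async get() {
    return "User status: online";
  },
};

// Run all providers, stash their output on the state object, and
// fold it into the main context string.
async function composeContext(
  providers: Provider[],
  state: Record<string, string>
): Promise<string> {
  const injected = await Promise.all(providers.map((p) => p.get(state)));
  state.providers = injected.join("\n");
  return `${state.providers}\n${state.recentMessages ?? ""}`;
}
```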
Discord bot is a cloudflare worker that runs to connect users to CJ
https://github.com/discord/cloudflare-sample-app
- Create new package in monorepo
- Deploy Discord bot to CF
- Configure secrets
- Test with routing to Runtime agent
- Test with "introduce" feature and storing users generally
Right now we are tied to Supabase. We want a local build option that isn't so tightly coupled to it.
Kinda needs this: #4
Currently we don't have a temperature argument for AI completions. We should add this argument for OpenAI for now. Soon we will abstract this to a model provider.
const response = await runtime.completion({
  context,
  frequency_penalty,
  temperature, // add this and pass it through to the OpenAI request
});
Add the introduce action
Embeddings cost money, and 20-30% of them are probably repeats of obvious things, like "hey whats up" (in the case of a memory embedding).
When inserting embeddings, first check whether any existing row has the same string. Bonus points for a Levenshtein-distance check for string similarity. If such a row exists and has a vector, reuse it instead of generating a new one.
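A sketch of that dedup check, with an in-memory array standing in for the database table (the row shape and function names are made up for illustration):

```typescript
// Classic dynamic-programming Levenshtein edit distance.
function levenshtein(a: string, b: string): number {
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,
        dp[i][j - 1] + 1,
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)
      );
    }
  }
  return dp[a.length][b.length];
}

interface Row { content: string; embedding: number[] | null }

// Reuse a stored vector if an existing row is within `maxDistance`
// edits of the new string; return null to signal a real embedding
// call is needed.
function findReusableEmbedding(rows: Row[], content: string, maxDistance = 2): number[] | null {
  for (const row of rows) {
    if (row.embedding && levenshtein(row.content, content) <= maxDistance) {
      return row.embedding;
    }
  }
  return null;
}
```

In practice the exact-match lookup would be a simple indexed query, with the Levenshtein pass only over a small candidate set.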
Right now there is no way to turn off reflection evaluation. We should make the evaluators field override this so that we can make bots that don't run evaluate.
We should add a section for Providers to the README
Right now we have a logger but it's... meh. Options include a logs table, or a logs folder with a filename value.
Most agents need to have specific knowledge about their use case. We call this "lore" -- it could be game lore, or it could be the docs for your repo, or the business information for your company.
One of the features of bgent is that there is a facts evaluator, which derives facts from the conversation. Recent facts are recalled, along with "relevant" facts -- i.e. facts that in some way relate to the current conversation.
Facts should have a timestamp, and the timestamp should be rendered as "X minutes ago", "X hours ago", "X days ago".
We should create a set of functions (or find a library) that turns time information into semantic info, for example "3 days ago".
For displaying facts, we should try not to pollute the output with too many "3 days ago" labels; instead, sort the facts by date and use one header per day, or "last month", or whatever looks clean and makes sense.
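A minimal version of the relative-time formatter described above (a library like date-fns, or the built-in `Intl.RelativeTimeFormat`, could replace this; the function name is just a placeholder):

```typescript
// Convert a millisecond timestamp into "X minutes/hours/days ago".
// `now` is injectable to keep the function testable.
function timeAgo(timestamp: number, now: number = Date.now()): string {
  const seconds = Math.floor((now - timestamp) / 1000);
  const units: [string, number][] = [
    ["day", 86400],
    ["hour", 3600],
    ["minute", 60],
  ];
  for (const [name, span] of units) {
    const count = Math.floor(seconds / span);
    if (count >= 1) return `${count} ${name}${count > 1 ? "s" : ""} ago`;
  }
  return "just now";
}
```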
Conditions are messages that can be placed in the context with no additional action, etc. They are evaluated on every run.
- user can introduce status, using introduce condition check
- should be flexible enough to think about user payment status conditions
We want to hold state data about users that is specific to an agent but broad for them across all channels.
A good example is credits. Creating a whole table for this isn't great and doesn't scale. A flexible JSON store can be used by many different plugins.
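A sketch of what that flexible store could look like, with a `Map` standing in for a single JSONB column keyed by agent and user (the class and method names are hypothetical):

```typescript
type UserState = Record<string, unknown>;

// Per-user, per-agent JSON store: one row per (agent, user) pair
// instead of one table per feature.
class AgentUserStore {
  private data = new Map<string, UserState>();

  private key(agentId: string, userId: string): string {
    return `${agentId}:${userId}`;
  }

  get(agentId: string, userId: string): UserState {
    return this.data.get(this.key(agentId, userId)) ?? {};
  }

  // Shallow-merge a patch so plugins can each own their own keys,
  // e.g. a payments plugin writing { credits: 10 }.
  patch(agentId: string, userId: string, patch: UserState): void {
    const next = { ...this.get(agentId, userId), ...patch };
    this.data.set(this.key(agentId, userId), next);
  }
}
```

Because each plugin only touches its own keys, plugins can share the row without schema migrations.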
Right now we're focused on ChatGPT to get going, but LocalLlama support would be great. If we can just script that, awesome.
https://github.com/lgrammel/modelfusion
We employ a few concepts -- actions, evaluators, providers, context-- that might not be clear to people. We should clearly explain what these are and how they work in the documentation.
We are storing 3072-dimensional zero vectors for messages that we are not currently vectorizing. We should set these to null instead.
Right now the template has things like "Actors in scene:" and this is ugly. We should have a function that adds a header to state elements in the composeState function.
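The helper could be as small as this. The name `addHeader` and the exact formatting are assumptions, not the current codebase:

```typescript
// Prefix a state section with its header, and drop the header
// entirely when the section is empty so the prompt stays clean.
function addHeader(header: string, body: string): string {
  return body.length > 0 ? `${header}\n${body}\n` : "";
}
```

composeState would then call `addHeader("Actors in scene:", actorsText)` instead of hard-coding the labels inline.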
Right now we have a kind of messy situation with Content, where it's a string sometimes, and a JSON object other times.
Instead, we should unify that and make content always be a string. All other content moves to the metadata field, also a JSONB column.
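One way to model the unified shape, plus a normalizer for legacy rows. These types are hypothetical, not the current bgent definitions:

```typescript
// content is always a string; everything else lives in metadata,
// which maps to a JSONB column in Postgres.
interface Memory {
  content: string;
  metadata?: Record<string, unknown>;
}

// Normalize legacy values where content was sometimes a raw string
// and sometimes a JSON object with extra fields.
function normalize(raw: string | { content: string; [k: string]: unknown }): Memory {
  if (typeof raw === "string") return { content: raw };
  const { content, ...metadata } = raw;
  return { content, metadata };
}
```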
Right now our tests are not so great. We are scoring 50% on some.
To test, run 'npm run test'.
Here are the latest benchmarks
[
{
"testName": "Extract programmer and startup facts",
"attempts": 4,
"successful": 3,
"successRate": 75
},
{
"testName": "Extract married fact, ignoring known facts",
"attempts": 4,
"successful": 3,
"successRate": 75
},
{
"testName": "Run the evaluation process",
"attempts": 4,
"successful": 2,
"successRate": 50
},
{
"testName": "Expect ignore",
"attempts": 4,
"successful": 3,
"successRate": 75
},
{
"testName": "Test validate function response",
"attempts": 4,
"successful": 1,
"successRate": 25
}
]
We should try to get all of these to 100.
Right now it's kind of hard to set up a new instance, because the supabase stuff doesn't have a clear sync, and can't be run locally without a cloud setup (even though that's free).
- supabase CLI tool and integration
- supabase local instance
For IGNORE actions, we should not send the message to the user, and we should blank out the content or format it appropriately for context. We should remove these ignore messages from the script if it makes sense.
Right now it's pretty hard to max out conversation length, but possible. We should add caps to prevent massive conversations and spam, and probably cap input length as well.
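The caps themselves can be simple. The limits below are made-up placeholders, not values from the codebase:

```typescript
// Placeholder limits; tune per deployment.
const MAX_MESSAGES = 50;
const MAX_INPUT_CHARS = 4000;

// Keep only the most recent messages in the conversation window.
function capConversation<T>(messages: T[]): T[] {
  return messages.slice(-MAX_MESSAGES);
}

// Truncate oversized user input before it reaches the model.
function capInput(input: string): string {
  return input.slice(0, MAX_INPUT_CHARS);
}
```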
We shouldn't be tied to Supabase. We could also use something like Chroma for a quick, just-works local deployment, and Mongo users might want that option too. Swapping in a database adapter seems like the right call.
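A rough sketch of that adapter, so Supabase, Chroma, or Mongo could each implement the same surface. The method names are guesses at what the runtime needs, not an existing interface:

```typescript
// Minimal surface the runtime would depend on instead of Supabase.
interface DatabaseAdapter {
  getMemories(roomId: string, count: number): Promise<string[]>;
  createMemory(roomId: string, content: string): Promise<void>;
  searchByEmbedding(embedding: number[], count: number): Promise<string[]>;
}

// In-memory adapter: useful for tests and "just works" local runs.
class InMemoryAdapter implements DatabaseAdapter {
  private rooms = new Map<string, string[]>();

  async getMemories(roomId: string, count: number): Promise<string[]> {
    return (this.rooms.get(roomId) ?? []).slice(-count);
  }
  async createMemory(roomId: string, content: string): Promise<void> {
    const list = this.rooms.get(roomId) ?? [];
    list.push(content);
    this.rooms.set(roomId, list);
  }
  // A real adapter would do vector search; this stub just returns
  // the first `count` memories.
  async searchByEmbedding(_embedding: number[], count: number): Promise<string[]> {
    return [...this.rooms.values()].flat().slice(0, count);
  }
}
```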
Right now the database setup is not clear, so making a new agent with bgent is confusing.
Right now some of the tests are flaky. A good example is data extraction. Instead of a single test, we should have a best-of-N testing setup for any AI responses with several evaluation points.
We can store individual responses (for example, the details extract returns name and gender but not age) as well as overall success rates.
We can't expect 100% success rate but having some benchmark or gradient for testing would make prompt engineering viable and far more efficient.
Right now we are tied to Supabase. We should also offer a Firebase option.
Right now we are 503ing on the tests because they are blasting too fast. We need to rate limit somewhat so we don't get crushed.
Add the rolodex provider. It should provide the most relevant users based on the profile of the current user.
Right now we are tied to Supabase. We should also offer a Mongo option.