jointhealliance / bgent Goto Github PK

Flexible, scalable and customizable agents to do your bidding.

JavaScript 4.87% TypeScript 86.35% PLpgSQL 8.71% Shell 0.06%

bgent's Issues

bgent Internet Computer example

We want to demonstrate bgent running on-chain. We have a Cloudflare Worker version, which has a similar state pattern. However, our agent is currently tied to Supabase.

This depends on:
#53
#4

Create new project example using Azle on the IC and add bgent to the project
Create new IC database provider and MemoryManager class that is IC compatible
Abstract all function calls that rely on supabase so they can be overridden
Look into sqlite and sqlite-vss with Azle
-> https://github.com/demergent-labs/azle/tree/main/examples/sqlite/src and https://github.com/asg017/sqlite-vss
Create a database provider abstraction for Internet Computer
-> Similar to these #53 and #4
Get agent running on the internet computer with conversational input
-> Example of agent running in Cloudflare: https://github.com/ClioDynamics/cojourney/blob/main/packages/agent/src/index.ts
Make sure vector search works
-> Elna makes a vector db for IC: https://github.com/elna-ai/elna-vector-db
-> Vector db for sqlite https://github.com/asg017/sqlite-vss
-> https://github.com/arcmindai/arcmindai Arcmind makes a vector db canister which would be used / modified

The final output is an agent running in a canister on the Internet Computer using Azle and bgent. We should be able to converse with the agent, and the agent should be able to perform search / RAG on documents.

Implement goals

Right now goals are half-implemented and not fully functional. This is one of the last pre-release alpha features we need for our first goal directed agents to be... goal directed.

finish implementation
add update goal action
add update objective action
add examples
add tests

Update providers, evaluators and actions to make sure they are optimized for validation

We want to run things as minimally as possible. We should make sure our validation is working and not providing too many LLM options or causing LLM calls when we only have one option.

This is related to #14 and is possibly a duplicate. The important part here is that our providers, actions and evaluators should not force a composeState call unless they need new state. When they do, they should retrieve only the new state they need and add it to the existing state object.

Design bgent.org

bgent.org is kinda ugly, needs design and mobile responsive cleanup

Discord.js Example - Eliza

We have a Discord interactions handler bot, but we want a Discord.js bot example running

Deploy to AWS on a free EC2 instance with CI/CD
Create Github example project
Make sure lore is properly consumed from base docs in CI/CD
Add Elevenlabs voice integration and joining / leaving voice channels
Implement "introduce" and "rolodex" with discord users
Behavior and configurability for running in channels
DM handling
Algorithm to make the agent not annoying

Optimize state handling and passing

Right now we are re-composing state a lot, or making database calls for the same state multiple times in parallel.

We could optimize our performance and exec time by caching state with a system that ensures that we always have the latest cache.

composeState queries actors, events, facts, conversation, etc. Its result is only altered by the addition of new messages. Most functions have an optional state input, but some of these are not set up properly and compose the state again anyway.

It's important that on elaborate, and in the window between the user's message being saved and the agent responding, we update the conversation history and any other necessary state locally while passing the state object through.
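
One way the caching idea could look is sketched below. This is only an illustration: `StateCache`, `State` and the `compose` callback are hypothetical names standing in for composeState, and the assumption is that state is keyed by a room ID and only goes stale when a new message is saved.

```typescript
// Hypothetical sketch, not bgent's API: cache composed state per room and
// invalidate it whenever a new message is saved.
type State = Record<string, unknown>;

class StateCache {
  private cache = new Map<string, Promise<State>>();

  // `compose` stands in for the real composeState call.
  constructor(private compose: (roomId: string) => Promise<State>) {}

  // Parallel callers for the same room share one in-flight compose call.
  get(roomId: string): Promise<State> {
    let entry = this.cache.get(roomId);
    if (!entry) {
      entry = this.compose(roomId);
      this.cache.set(roomId, entry);
    }
    return entry;
  }

  // Call after a message is saved so the next read recomposes.
  invalidate(roomId: string): void {
    this.cache.delete(roomId);
  }
}
```

Storing the promise rather than the resolved value is what deduplicates parallel database calls. A production version would also need to evict entries whose compose call failed.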

ModelProvider abstraction

We should abstract our completion and embedding to handle different model providers, not just OpenAI.

Move OpenAI to ModelProvider abstraction
Add together.xyz and mistral example
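
A rough sketch of what the abstraction might look like follows. The interface shape, the method names and the `EchoProvider` stub are all assumptions made for illustration; the real OpenAI, together.xyz or Mistral providers would call their respective APIs behind the same interface.

```typescript
// Hypothetical interface; method names are assumptions, not bgent's API.
interface ModelProvider {
  name: string;
  completion(opts: { context: string; temperature?: number }): Promise<string>;
  embed(text: string): Promise<number[]>;
}

// Stub provider for local testing. A real provider would call a hosted model.
class EchoProvider implements ModelProvider {
  name = "echo";

  async completion(opts: { context: string; temperature?: number }): Promise<string> {
    return `echo: ${opts.context}`;
  }

  async embed(text: string): Promise<number[]> {
    // Toy embedding: character codes of the first 4 characters, zero-padded.
    return Array.from({ length: 4 }, (_, i) => text.charCodeAt(i) || 0);
  }
}

// Lets the runtime look up whichever provider the user configured.
class ModelProviderRegistry {
  private providers = new Map<string, ModelProvider>();

  register(provider: ModelProvider): void {
    this.providers.set(provider.name, provider);
  }

  get(name: string): ModelProvider {
    const provider = this.providers.get(name);
    if (!provider) throw new Error(`Unknown model provider: ${name}`);
    return provider;
  }
}
```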

Fix RLS rules

Right now RLS is disabled on some of the tables. This is because we are calling postgres triggers which are not properly passing the authentication down, or the place it is being passed from is not right.

We need to look at how to fix the RLS rules so that we have strict RLS on everything.

This especially affects account creation.

Make sure all tests are cleaning up

Right now tests are not cleaning up properly, and leave some junk in the database. We should make sure that this is all cleaned up to maintain reliable testing.

"Provider" abstraction

We need some way to arbitrarily allow the user to add state data to their agent.

This can relate to our idea of "statuses" -- we need providers to inject context in at generation time.

This should inject into the state object and then get injected into the main context.
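
As a sketch of the injection flow described above, under the assumption that a provider is just a keyed function returning a string: `Provider`, `applyProviders`, `renderProviderContext` and `makeTimeProvider` are hypothetical names, and real providers would probably be async.

```typescript
// Hypothetical sketch: providers add strings to the state object, and the
// state is later rendered into the main context.
type State = Record<string, unknown>;

interface Provider {
  key: string;
  get(state: State): string; // a real provider would likely return a Promise
}

// Example provider: gives the agent awareness of the current time.
function makeTimeProvider(now: Date): Provider {
  return { key: "time", get: () => `The current time is ${now.toISOString()}` };
}

// Returns a new state object with provider output stored under `providers`.
function applyProviders(state: State, providers: Provider[]): State {
  const injected: Record<string, string> = {};
  for (const provider of providers) {
    injected[provider.key] = provider.get(state);
  }
  return { ...state, providers: injected };
}

// Joins all provider output into one block for the prompt context.
function renderProviderContext(state: State): string {
  const injected = (state.providers ?? {}) as Record<string, string>;
  return Object.values(injected).join("\n");
}
```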

Discord bot example

The Discord bot is a Cloudflare Worker that connects users to CJ
https://github.com/discord/cloudflare-sample-app
- Create new package in monorepo
- Deploy Discord bot to CF
- Configure secrets
- Test with routing to Runtime agent
- Test with "introduce" feature and storing users generally

Sqlite + Sqlite VSS Adapter

Right now we are tied to Supabase. We want a local build option that isn't so tied into it

Abstract Supabase adapter
Add a sqlite db adapter
Add docs for quickstart locally
Make developer experience for getting started ideal with sqlite, i.e. no db setup at all

Kinda needs this: #4

Add temperature

Currently we don't have a temperature argument for AI completions. We should add this argument for OpenAI for now. Soon we will abstract this to a model provider.

const response = await runtime.completion({
  context,
  frequency_penalty,
  temperature, // add this
});

"Introduce" action

Add the introduce action

search for most likely in rolodex
make connection
add / test rolodex handling of people
add profiles for other users
perform search for users based on descriptions
if a user gets introduced, put it in the chat stream as "Agent introduced User to UserB"
add test conditions where introduction should be made
add test conditions where introduction should fail
introduce condition check
- user has completed ftu goals
- user has sent > 30 messages
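
The condition check above could be sketched as a small pure function. The `UserProgress` shape and its field names are assumptions for illustration; only the two conditions come from the list.

```typescript
// Hypothetical shape; field names are assumptions.
interface UserProgress {
  completedFtuGoals: boolean;
  messageCount: number;
}

// A user qualifies once they have completed their ftu goals and have sent
// more than `minMessages` messages (30 in the list above).
function canBeIntroduced(user: UserProgress, minMessages = 30): boolean {
  return user.completedFtuGoals && user.messageCount > minMessages;
}
```

Keeping the check pure makes it easy to cover both the "should be made" and "should fail" test conditions.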

Embedding caching

Embeddings cost money, and 20-30% of them are probably repeats of obvious things, like "hey whats up" (in the case of a memory embedding).

When inserting embeddings, first check whether any existing row has the same string. Bonus points for a Levenshtein distance check for string similarity. If a matching row is returned and has a vector, reuse it instead of creating a new vector.
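
A minimal in-memory sketch of that lookup is below. `MemoryRow` and `findCachedEmbedding` are hypothetical names, and scanning rows in application code is only for illustration; in practice the exact-match check would more likely be a database query.

```typescript
// Hypothetical row shape for a stored memory.
interface MemoryRow {
  content: string;
  embedding: number[] | null;
}

// Classic single-row dynamic-programming Levenshtein distance.
function levenshtein(a: string, b: string): number {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // distance for (i - 1, j - 1)
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // distance for (i - 1, j)
      row[j] = Math.min(
        above + 1, // deletion
        row[j - 1] + 1, // insertion
        diagonal + (a[i - 1] === b[j - 1] ? 0 : 1), // substitution
      );
      diagonal = above;
    }
  }
  return row[b.length];
}

// Returns an existing vector if a stored string is within `maxDistance`
// edits of `text`; maxDistance = 0 means exact matches only.
function findCachedEmbedding(rows: MemoryRow[], text: string, maxDistance = 0): number[] | null {
  for (const row of rows) {
    if (row.embedding && levenshtein(row.content, text) <= maxDistance) {
      return row.embedding;
    }
  }
  return null;
}
```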

Evaluators override default evaluators

Right now there is no way to turn off reflection evaluation. We should make the evaluators field override this so that we can make bots that don't run evaluate.

Add providers to README

We should add a section for Providers to the README

Improve logger

Right now we have a logger but it's... meh.

Make sure logger.log looks nice with a nice frame, right now a little janky
Make sure colors work and look nice everywhere
logger.db should write to logs table
logger.file should write a file to the logs folder with a filename value
Migrate existing calls to logs

"Lore" handling

Most agents need to have specific knowledge about their use case. We call this "lore" -- it could be game lore, or it could be the docs for your repo, or the business information for your company.

Add easy way to split documents and upload lore
Add lore search to context
Add lore examples for Cojourney agent and auto-seeding

Add timestamps to facts

One of the features of bgent is that there is a facts evaluator, which derives facts from the conversation. Recent facts are recalled, along with "relevant" facts -- i.e. facts that in some way relate to the current conversation.

Facts should have timestamp, timestamp should be turned into X minutes ago, X hours ago, X days ago

We should create a set of functions (or find a library) that turns time information into semantic info. For example, "3 days ago".

When displaying facts, we should try not to pollute the context with too much "3 days ago". Instead, sort the facts by date and have one header for each day, or "last month", or whatever looks clean and makes sense.

Add functions to convert timestamp to "x minutes/days/weeks/months ago" (probably a library on npm for this)
Add this to the formatting of facts that are returned
Test this with facts to verify that relevant facts from long ago are indeed brought up
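
If no npm library is used, the conversion could be hand-rolled along these lines. `timeAgo` is a hypothetical helper, timestamps are assumed to be in milliseconds, and a month is approximated as 30 days.

```typescript
// Hypothetical helper: turns a millisecond timestamp into "X units ago".
function timeAgo(timestamp: number, now: number = Date.now()): string {
  const seconds = Math.max(0, Math.floor((now - timestamp) / 1000));
  // Largest unit first; a month is approximated as 30 days.
  const units: [string, number][] = [
    ["month", 60 * 60 * 24 * 30],
    ["week", 60 * 60 * 24 * 7],
    ["day", 60 * 60 * 24],
    ["hour", 60 * 60],
    ["minute", 60],
  ];
  for (const [name, size] of units) {
    const count = Math.floor(seconds / size);
    if (count >= 1) return `${count} ${name}${count === 1 ? "" : "s"} ago`;
  }
  return "just now";
}
```

Passing `now` explicitly keeps the function deterministic in tests.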

Condition Statuses

Conditions are messages that can be placed in the context with no additional action. They are evaluated on every run.
- user can introduce status, using introduce condition check
- should be flexible enough to think about user payment status conditions

give agent awareness of current time

Global Settings + User Data

We want to hold state data about users that is specific to an agent but broad for them across all channels.

A good example is credits. Creating a whole table for this isn't great and doesn't scale. A flexible JSON store can be used by many different plugins.

Implement this
Migrate "credits" to this

Local model support

Right now we're focused on ChatGPT to get going, but LocalLlama support would be great. If we can just script that, awesome.

Interoperability between ai models

https://github.com/lgrammel/modelfusion

implement this and let user pick what models they have
have to update .env files for keys probably as well

Add "Concepts" to docs

We employ a few concepts (actions, evaluators, providers, context) that might not be clear to people. We should clearly explain what these are and how they work in the documentation.

Remove empty [0...0] vectors for messages

We are storing 3072-dimensional zero vectors for messages, which we are currently not vectorizing. We should set these to null instead.
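
On the write path, the guard might be as small as the sketch below. `nullifyZeroVector` is a hypothetical helper; cleaning up rows that already exist would still need a separate database migration.

```typescript
// Hypothetical helper: store null instead of an all-zero placeholder vector.
function nullifyZeroVector(embedding: number[] | null): number[] | null {
  if (!embedding || embedding.every((value) => value === 0)) return null;
  return embedding;
}
```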

Move template section headers into context injector function

Right now the template has things like "Actors in scene:" hardcoded, which is ugly. We should have a function that adds headers to state elements in the composeState function.
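
A possible shape for that function is sketched here. `addHeader` is a hypothetical name, and skipping the header when the body is empty is an assumption about the desired behavior.

```typescript
// Hypothetical helper: prefix a state element with its section header,
// or return an empty string when there is nothing to show.
function addHeader(header: string, body: string): string {
  return body.trim().length > 0 ? `${header}\n${body}\n` : "";
}
```

With this in place, the template would only reference the state element, and empty sections would not leave a dangling header in the context.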

Add 'metadata' jsonb to memories table and refactor actions into metadata, refactor Content and Metadata

Right now we have a kind of messy situation with Content where it's a string sometimes, and a JSON object other times.

Instead, we should unify that and make content always be a string. All other content moves to the metadata field, also a JSONB column.
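
The migration of existing values could follow the sketch below. It assumes that the JSON form of Content carries a `content` string plus extra fields such as `action`; that shape, and the names `LegacyContent` and `normalizeContent`, are assumptions for illustration.

```typescript
// Assumed legacy shapes: either a plain string or an object with a
// `content` string and arbitrary extra fields.
type LegacyContent = string | { content: string; [key: string]: unknown };

interface NormalizedMemory {
  content: string;
  metadata: Record<string, unknown>;
}

// Content always becomes a string; everything else moves to metadata.
function normalizeContent(legacy: LegacyContent): NormalizedMemory {
  if (typeof legacy === "string") {
    return { content: legacy, metadata: {} };
  }
  const { content, ...metadata } = legacy;
  return { content, metadata };
}
```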

Improve test success rate

Right now our tests are not so great. We are scoring 50% on some.

To test, run 'npm run test'.

Here are the latest benchmarks

[
  {
    "testName": "Extract programmer and startup facts",
    "attempts": 4,
    "successful": 3,
    "successRate": 75
  },
  {
    "testName": "Extract married fact, ignoring known facts",
    "attempts": 4,
    "successful": 3,
    "successRate": 75
  },
  {
    "testName": "Run the evaluation process",
    "attempts": 4,
    "successful": 2,
    "successRate": 50
  },
  {
    "testName": "Expect ignore",
    "attempts": 4,
    "successful": 3,
    "successRate": 75
  },
  {
    "testName": "Test validate function response",
    "attempts": 4,
    "successful": 1,
    "successRate": 25
  }
]

We should try to get all of these to 100.

Supabase local and supabase sync

Right now it's kind of hard to set up a new instance, because the Supabase setup doesn't have a clear sync and can't be run locally without a cloud setup (even though that's free).

Add supabase CLI tool and integration
Sync current tables, functions, triggers, types and seeds from the current DB
Process and documentation for populating a new Supabase local instance
Process and documentation for syncing new Supabase cloud instance

Blank out message for IGNORE, send and record

For IGNORE actions, we should not send the message to the user, and we should blank out the content or format appropriately for context. We should remove these ignore messages from the script if it makes sense.

Cap tokens in conversation

Right now it's pretty hard, but possible, to max out the conversation length. We should add caps to prevent massive conversations and spam. We should also probably cap input length.
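
A rough sketch of both caps is below. The helper names are hypothetical, and the token estimate of about four characters per token is a crude heuristic assumed for illustration; a real tokenizer would be more accurate.

```typescript
// Crude heuristic (assumption): roughly 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keeps the most recent messages that fit within `maxTokens`.
function capConversation(messages: string[], maxTokens: number): string[] {
  const kept: string[] = [];
  let total = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = estimateTokens(messages[i]);
    if (total + cost > maxTokens) break;
    kept.unshift(messages[i]);
    total += cost;
  }
  return kept;
}

// Truncates a single user input to `maxChars` characters.
function capInput(text: string, maxChars: number): string {
  return text.slice(0, maxChars);
}
```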

Move supabase to adapter

We shouldn't be tied to Supabase. Could also use something like Chroma for quick, just-works local deployment. And mongo people might want that. Swapping in a db adapter seems like the right call.

Move all supabase calls into SupabaseDBClient adapter with generic functions
Make all functions extend a base DBClient class
Inject SupabaseDBClient into BgentRuntime instance
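
The three steps above could be sketched as follows. The `Memory` shape, the method names and `SketchRuntime` are hypothetical; `InMemoryDBClient` stands in for where a SupabaseDBClient (or a sqlite or Chroma client) would go.

```typescript
// Hypothetical minimal memory shape.
interface Memory {
  id: string;
  roomId: string;
  content: string;
}

// Base class every database adapter would extend.
abstract class DBClient {
  abstract createMemory(memory: Memory): Promise<void>;
  abstract getMemoriesByRoom(roomId: string): Promise<Memory[]>;
}

// Stand-in adapter; a SupabaseDBClient would implement the same methods
// using Supabase calls.
class InMemoryDBClient extends DBClient {
  private memories: Memory[] = [];

  async createMemory(memory: Memory): Promise<void> {
    this.memories.push(memory);
  }

  async getMemoriesByRoom(roomId: string): Promise<Memory[]> {
    return this.memories.filter((memory) => memory.roomId === roomId);
  }
}

// The runtime only depends on the abstract DBClient, so adapters can be swapped.
class SketchRuntime {
  constructor(public readonly db: DBClient) {}
}
```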

Easy database setup and handling

Right now the database setup is not clear, so making a new agent with bgent is confusing.

Set up supabase local (see other issues)
Docs for setting up a new Supabase cloud instance
automate and save scripts for dumping and syncing

Threshold testing

Right now some of the tests are flaky. A good example is data extraction. Instead of a single test, we should have a best-of-N testing setup for any AI responses with several evaluation points.

We can store individual responses (for example, the details extract returns name and gender but not age) as well as overall success rates.

We can't expect 100% success rate but having some benchmark or gradient for testing would make prompt engineering viable and far more efficient.
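
A best-of-N runner could look roughly like this. The result fields mirror the benchmark output format used for bgent's tests (testName, attempts, successful, successRate); the function names and the `threshold` parameter are assumptions.

```typescript
interface ThresholdResult {
  testName: string;
  attempts: number;
  successful: number;
  successRate: number; // percentage, 0-100
}

// Pure summary step, kept separate so it can be tested without a model.
function summarize(testName: string, outcomes: boolean[]): ThresholdResult {
  const successful = outcomes.filter(Boolean).length;
  const successRate =
    outcomes.length === 0 ? 0 : Math.round((successful / outcomes.length) * 100);
  return { testName, attempts: outcomes.length, successful, successRate };
}

// Runs `attempt` N times and passes if the success rate meets the threshold.
async function runThresholdTest(
  testName: string,
  attempt: () => Promise<boolean>,
  attempts = 4,
  threshold = 75,
): Promise<ThresholdResult & { passed: boolean }> {
  const outcomes: boolean[] = [];
  for (let i = 0; i < attempts; i++) {
    // A rejected attempt counts as a failure rather than aborting the run.
    outcomes.push(await attempt().catch(() => false));
  }
  const result = summarize(testName, outcomes);
  return { ...result, passed: result.successRate >= threshold };
}
```

Tracking the rate over time, rather than a single pass or fail, is what gives prompt changes a measurable gradient.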

Firebase Adapter

Right now we are tied to Supabase. We should also offer a Firebase option.

Add Firebase adapter
Add docs for setup
Integrate and test with Google Vertex AI extension

Add Firebase adapter
Add docs for quickstart locally
Integrate and test with vector search
