bgent's Issues

bgent Internet Computer example

We want to demonstrate bgent running on-chain. We have a Cloudflare Worker version, which has a similar state pattern. However, our agent is currently tied to Supabase.

This kind of relies on these:
#53
#4

The final output is an agent running in a canister on the Internet Computer using Azle and bgent. We should be able to converse with the agent, and the agent should be able to perform search / RAG on documents.

Implement goals

Right now goals are half-implemented and not fully functional. This is one of the last pre-release alpha features we need for our first goal directed agents to be... goal directed.

  • finish implementation
  • add update goal action
  • add update objective action
  • add examples
  • add tests

Update providers, evaluators and actions to make sure they are optimized for validation

We want to run things as minimally as possible. We should make sure our validation is working and not providing too many LLM options or causing LLM calls when we only have one option.

This is related to #14 -- possibly duplicate, but the important part here is that our providers, actions and evaluators are also not forcing a composeState unless they need new state, or are retrieving new state they need and adding it to the existing state object.

Design bgent.org

bgent.org is kinda ugly, needs design and mobile responsive cleanup

Discord.js Example - Eliza

We have a Discord interactions handler bot, but we want a Discord.js bot example running

  • Deploy to AWS on a free EC2 instance with CI/CD
  • Create Github example project
  • Make sure lore is properly consumed from base docs in CI/CD
  • Add Elevenlabs voice integration and joining / leaving voice channels
  • Implement "introduce" and "rolodex" with discord users
  • Behavior and configurability for running in channels
  • DM handling
  • Algorithm to make the agent not annoying

Optimize state handling and passing

Right now we are re-composing state a lot, or making database calls for the same state multiple times in parallel.

We could optimize our performance and exec time by caching state with a system that ensures that we always have the latest cache.

composeState queries actors, events, facts, conversation, etc. It will only be altered by the addition of new messages. Most functions have an optional state input, but some of these are not set up properly and compose the state again anyway.

It's important that, on elaborate and in the window between the user's message being saved and the agent responding, we update the conversation history and any other necessary state locally while passing the state object through.
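
A minimal sketch of the "update locally, pass it through" idea, assuming a heavily simplified state shape. `LocalState`, `LocalMessage`, `withNewMessage` and `resolveState` are hypothetical names for illustration, not bgent's actual types, and the real composeState is asynchronous and hits the database.

```typescript
// Hypothetical, simplified shapes; bgent's real state has more fields.
interface LocalMessage {
  userId: string;
  content: string;
}

interface LocalState {
  recentMessages: LocalMessage[];
  facts: string[];
}

// Return a new state that includes the message, without touching the database.
function withNewMessage(state: LocalState, message: LocalMessage): LocalState {
  return { ...state, recentMessages: [...state.recentMessages, message] };
}

// Reuse the state that was passed in; only compose when the caller has none.
function resolveState(
  state: LocalState | undefined,
  compose: () => LocalState,
): LocalState {
  return state ?? compose();
}
```

With this shape, a function that accepts an optional state only pays for a compose when it genuinely has nothing to work with.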

ModelProvider abstraction

We should abstract our completion and embedding to handle different model providers, not just OpenAI.

  • Move OpenAI to ModelProvider abstraction
  • Add together.xyz and mistral example

Fix RLS rules

Right now RLS is disabled on some of the tables. This is because we are calling postgres triggers which are not properly passing the authentication down, or the place it is being passed from is not right.

We need to look at how to fix the RLS rules so that we have strict RLS on everything.

This especially affects account creation.

Make sure all tests are cleaning up

Right now tests are not cleaning up properly, and leave some junk in the database. We should make sure that this is all cleaned up to maintain reliable testing.

"Provider" abstraction

We need some way to arbitrarily allow the user to add state data to their agent.

This can relate to our idea of "statuses" -- we need providers to inject context in at generation time.

This should inject into the state object and then get injected into the main context.
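
As a rough sketch of what this could look like, assuming each provider returns a snippet of text that gets appended to the state: `ContextProvider`, `ProviderState`, `applyProviders` and `timeProvider` are hypothetical names, not an existing bgent API.

```typescript
// Hypothetical state slice that collects provider output.
interface ProviderState {
  providerContext: string[];
}

// Hypothetical provider shape: a label plus a function that produces context text.
interface ContextProvider {
  label: string;
  get: (state: ProviderState) => string;
}

// Run every provider and append its output to the state object.
function applyProviders(
  state: ProviderState,
  providers: ContextProvider[],
): ProviderState {
  const additions = providers.map((p) => `${p.label}: ${p.get(state)}`);
  return { ...state, providerContext: [...state.providerContext, ...additions] };
}

// Example provider that injects the current time (clock passed in for testability).
const timeProvider = (now: Date): ContextProvider => ({
  label: "time",
  get: () => now.toISOString(),
});
```

The main context template would then join `providerContext` into the prompt at generation time.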

Sqlite + Sqlite VSS Adapter

Right now we are tied to Supabase. We want a local build option that isn't so tied into it

  • Abstract Supabase adapter
  • Add a sqlite db adapter
  • Add docs for quickstart locally
  • Make developer experience for getting started ideal with sqlite, i.e. no db setup at all

Kinda needs this: #4

Add temperature

Currently we don't have a temperature argument for AI completions. We should add this argument for OpenAI for now. Soon we will abstract this to a model provider.

const response = await runtime.completion({
  context,
  frequency_penalty,
  temperature, // add this
});

"Introduce" action

Add the introduce action

  • search for most likely in rolodex
  • make connection
  • add / test rolodex handling of people
  • add profiles for other users
  • perform search for users based on descriptions
  • if a user gets introduced, put it in the chat stream as "Agent introduced User to UserB"
  • add test conditions where introduction should be made
  • add test conditions where introduction should fail
  • introduce condition check
    • user has completed ftu goals
    • user has sent > 30 messages

Embedding caching

Embeddings cost money, and 20-30% of them are probably repeats of obvious things, like "hey whats up" (in the case of a memory embedding).

When inserting embeddings, check whether any existing row has the same string. Bonus points for a Levenshtein distance check for string similarity. If a matching row comes back and has a vector, use it instead of making a new vector.
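
A minimal sketch of the exact-match check, using an in-memory map as a stand-in for the database lookup. `EmbeddingCache` and `embedWithCache` are hypothetical names, the case and whitespace normalization is an added assumption, and the embed function is synchronous here only to keep the example short (a real call would be async).

```typescript
// In-memory stand-in for the "same string" lookup; a real version would query the database.
class EmbeddingCache {
  private vectors = new Map<string, number[]>();

  // Normalize so trivial differences in case and whitespace still hit the cache (assumption).
  private key(text: string): string {
    return text.trim().toLowerCase().replace(/\s+/g, " ");
  }

  lookup(text: string): number[] | undefined {
    return this.vectors.get(this.key(text));
  }

  store(text: string, vector: number[]): void {
    this.vectors.set(this.key(text), vector);
  }
}

// Reuse a cached vector when one exists; otherwise call the (paid) embed function once.
function embedWithCache(
  cache: EmbeddingCache,
  text: string,
  embed: (t: string) => number[],
): number[] {
  const cached = cache.lookup(text);
  if (cached) return cached;
  const vector = embed(text);
  cache.store(text, vector);
  return vector;
}
```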

Evaluators override default evaluators

Right now there is no way to turn off reflection evaluation. We should make the evaluators field override this so that we can make bots that don't run evaluate.

Improve logger

Right now we have a logger but it's... meh.

  • Make sure logger.log looks nice with a nice frame, right now a little janky
  • Make sure colors work and look nice everywhere
  • logger.db should write to logs table
  • logger.file should write a file to the logs folder with a filename value
  • Migrate existing calls to logs

"Lore" handling

Most agents need to have specific knowledge about their use case. We call this "lore" -- it could be game lore, or it could be the docs for your repo, or the business information for your company.

  • Add easy way to split documents and upload lore
  • Add lore search to context
  • Add lore examples for Cojourney agent and auto-seeding

Add timestamps to facts

One of the features of bgent is that there is a facts evaluator, which derives facts from the conversation. Recent facts are recalled, along with "relevant" facts -- i.e. facts that in some way relate to the current conversation.

Facts should have a timestamp, and the timestamp should be turned into "X minutes ago", "X hours ago" or "X days ago".

We should create a set of functions (or find a library) that turns time information into semantic info. For example, "3 days ago".

When displaying facts, we should try not to pollute the output with too much "3 days ago", but sort the facts by date and have one header for each day, or "last month", or whatever looks clean and makes sense.

  • Add functions to convert timestamp to "x minutes/days/weeks/months ago" (probably a library on npm for this)
  • Add this to the formatting of facts that are returned
  • Test this with facts to verify that relevant facts from long ago are indeed brought up
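
A small hand-rolled sketch of the conversion, in case a library turns out to be overkill. `timeAgo` is a hypothetical helper; it takes `now` as a parameter so it can be tested deterministically, and it stops at days for brevity.

```typescript
const MS_PER_MINUTE = 60 * 1000;
const MS_PER_HOUR = 60 * MS_PER_MINUTE;
const MS_PER_DAY = 24 * MS_PER_HOUR;

// Convert a timestamp into "X minutes/hours/days ago" relative to `now`.
function timeAgo(timestamp: Date, now: Date): string {
  const elapsed = Math.max(0, now.getTime() - timestamp.getTime());
  const phrase = (n: number, unit: string): string =>
    `${n} ${unit}${n === 1 ? "" : "s"} ago`;
  if (elapsed < MS_PER_MINUTE) return "just now";
  if (elapsed < MS_PER_HOUR) return phrase(Math.floor(elapsed / MS_PER_MINUTE), "minute");
  if (elapsed < MS_PER_DAY) return phrase(Math.floor(elapsed / MS_PER_HOUR), "hour");
  return phrase(Math.floor(elapsed / MS_PER_DAY), "day");
}
```

Weeks and months would follow the same pattern, and the same buckets could drive the per-day headers when displaying facts.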

Condition Statuses

Conditions are messages that can be placed in the context with no additional action. They are evaluated every run.
- user can introduce status, using introduce condition check
- should be flexible enough to think about user payment status conditions

  • give agent awareness of current time

Global Settings + User Data

We want to hold state data about users that is specific to an agent but broad for them across all channels.

A good example is credits. Creating a whole table for this isn't great and doesn't scale. A flexible JSON store can be used by many different plugins.

  • Implement this
  • Migrate "credits" to this
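
One possible shape for the flexible JSON store, sketched in memory. `UserDataStore` and its methods are hypothetical; a real version would presumably back this with a single JSON column keyed by agent and user.

```typescript
type JsonValue =
  | string
  | number
  | boolean
  | null
  | JsonValue[]
  | { [key: string]: JsonValue };

// In-memory stand-in for one JSON blob per (agent, user) pair.
class UserDataStore {
  private rows = new Map<string, { [key: string]: JsonValue }>();

  private rowKey(agentId: string, userId: string): string {
    return JSON.stringify([agentId, userId]);
  }

  get(agentId: string, userId: string, field: string): JsonValue | undefined {
    return this.rows.get(this.rowKey(agentId, userId))?.[field];
  }

  set(agentId: string, userId: string, field: string, value: JsonValue): void {
    const k = this.rowKey(agentId, userId);
    this.rows.set(k, { ...(this.rows.get(k) ?? {}), [field]: value });
  }
}
```

Credits would then be just another field, e.g. `store.set(agentId, userId, "credits", 10)`, and other plugins can add their own fields without new tables.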

Local model support

Right now we're focused on ChatGPT to get going, but LocalLlama support would be great. If we can just script that, awesome.

Add "Concepts" to docs

We employ a few concepts -- actions, evaluators, providers, context -- that might not be clear to people. We should clearly explain what these are and how they work in the documentation.

Improve test success rate

Right now our tests are not so great. Some are scoring 50% or lower.

To test, run 'npm run test'.

Here are the latest benchmarks

[
  {
    "testName": "Extract programmer and startup facts",
    "attempts": 4,
    "successful": 3,
    "successRate": 75
  },
  {
    "testName": "Extract married fact, ignoring known facts",
    "attempts": 4,
    "successful": 3,
    "successRate": 75
  },
  {
    "testName": "Run the evaluation process",
    "attempts": 4,
    "successful": 2,
    "successRate": 50
  },
  {
    "testName": "Expect ignore",
    "attempts": 4,
    "successful": 3,
    "successRate": 75
  },
  {
    "testName": "Test validate function response",
    "attempts": 4,
    "successful": 1,
    "successRate": 25
  }
]

We should try to get all of these to 100.

Supabase local and supabase sync

Right now it's kind of hard to set up a new instance, because the Supabase stuff doesn't have a clear sync, and can't be run locally without a cloud setup (even though that's free).

  • Add supabase CLI tool and integration
  • Sync current tables, functions, triggers, types and seeds from the current DB
  • Process and documentation for populating a new Supabase local instance
  • Process and documentation for syncing new Supabase cloud instance

Blank out message for IGNORE, send and record

For IGNORE actions, we should not send the message to the user, and we should blank out the content or format appropriately for context. We should remove these ignore messages from the script if it makes sense.

Cap tokens in conversation

Right now it's pretty hard to max out conversation length, but possible. We should add caps to prevent massive conversations and spam. We should also probably cap input length.
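
A rough sketch of both caps. The 4-characters-per-token estimate is an assumption standing in for a real tokenizer, and `capInput` / `capConversation` are hypothetical helper names.

```typescript
// Rough token estimate: about 4 characters per token. An assumption, not a tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Truncate a single input to a maximum number of characters.
function capInput(text: string, maxChars: number): string {
  return text.length > maxChars ? text.slice(0, maxChars) : text;
}

// Keep the most recent messages whose combined estimate fits the budget.
function capConversation(messages: string[], maxTokens: number): string[] {
  const kept: string[] = [];
  let used = 0;
  // Walk from newest to oldest and stop at the first message that does not fit.
  for (const message of [...messages].reverse()) {
    const cost = estimateTokens(message);
    if (used + cost > maxTokens) break;
    kept.unshift(message);
    used += cost;
  }
  return kept;
}
```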

Move supabase to adapter

We shouldn't be tied to Supabase. Could also use something like Chroma for quick, just-works local deployment. And mongo people might want that. Swapping in a db adapter seems like the right call.

  • Move all supabase calls into SupabaseDBClient adapter with generic functions
  • Make all functions extend a base DBClient class
  • Inject SupabaseDBClient into BgentRuntime instance
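
The three steps above could be sketched as follows. `MemoryRow`, `InMemoryDBClient` and `SketchRuntime` are hypothetical stand-ins (the real runtime is BgentRuntime and has a different constructor), and the two methods are only examples of the "generic functions" a SupabaseDBClient would implement.

```typescript
// Hypothetical minimal record; the real tables have more columns.
interface MemoryRow {
  id: string;
  content: string;
}

// Base class every adapter extends; the runtime only depends on this.
abstract class DBClient {
  abstract createMemory(row: MemoryRow): Promise<void>;
  abstract getMemories(): Promise<MemoryRow[]>;
}

// Stand-in adapter for the sketch; a SupabaseDBClient would implement the same methods.
class InMemoryDBClient extends DBClient {
  private rows: MemoryRow[] = [];

  async createMemory(row: MemoryRow): Promise<void> {
    this.rows.push(row);
  }

  async getMemories(): Promise<MemoryRow[]> {
    return [...this.rows];
  }
}

// Simplified runtime that receives its database adapter by injection.
class SketchRuntime {
  readonly db: DBClient;

  constructor(db: DBClient) {
    this.db = db;
  }
}
```

Swapping databases then means constructing the runtime with a different adapter, with no other call sites changing.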

Easy database setup and handling

Right now the database setup is not clear, so making a new agent with bgent is confusing.

  • Set up supabase local (see other issues)
  • Docs for setting up a new Supabase cloud instance
  • automate and save scripts for dumping and syncing

Threshold testing

Right now some of the tests are flaky. A good example is data extraction. Instead of a single test, we should have a best-of-N testing setup for any AI responses with several evaluation points.

We can store individual responses (for example, the details extract returns name and gender but not age) as well as overall success rates.

We can't expect 100% success rate but having some benchmark or gradient for testing would make prompt engineering viable and far more efficient.
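
A minimal sketch of the scoring side of a best-of-N setup. The result shape mirrors the benchmark entries under "Improve test success rate"; `summarizeAttempts` and `meetsThreshold` are hypothetical helpers, and running the N attempts themselves is left out.

```typescript
// Same fields as the benchmark entries listed under "Improve test success rate".
interface BenchmarkResult {
  testName: string;
  attempts: number;
  successful: number;
  successRate: number;
}

// Summarize N pass/fail attempts into a success rate (percentage).
function summarizeAttempts(testName: string, outcomes: boolean[]): BenchmarkResult {
  const attempts = outcomes.length;
  const successful = outcomes.filter(Boolean).length;
  const successRate =
    attempts === 0 ? 0 : Math.round((successful / attempts) * 100);
  return { testName, attempts, successful, successRate };
}

// A test passes when its success rate meets the chosen threshold.
function meetsThreshold(result: BenchmarkResult, thresholdPercent: number): boolean {
  return result.successRate >= thresholdPercent;
}
```

Each evaluation point (name, gender, age in the extraction example) could get its own outcomes array, so a prompt change shows up as a shift in individual rates and not just a flaky pass or fail.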

Firebase Adapter

Right now we are tied to Supabase. We should also offer a Firebase option.

  • Add Firebase adapter
  • Add docs for setup
  • Integrate and test with Google Vertex AI extension

Add rate limiting to runAiTest

Right now we are 503ing on the tests because they are blasting too fast. We need to rate limit somewhat so we don't get crushed.
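
One simple option is to space calls a minimum interval apart. A sketch, assuming a hypothetical `MinIntervalLimiter` that only computes how long to wait; the clock is passed in so the logic can be tested without real timers.

```typescript
// Spaces calls at least `minIntervalMs` apart.
class MinIntervalLimiter {
  private nextAllowed = 0;
  private readonly minIntervalMs: number;

  constructor(minIntervalMs: number) {
    this.minIntervalMs = minIntervalMs;
  }

  // Reserve the next slot and return how long the caller should wait (in ms).
  reserve(now: number): number {
    const start = Math.max(now, this.nextAllowed);
    this.nextAllowed = start + this.minIntervalMs;
    return start - now;
  }
}
```

Inside runAiTest, each attempt could sleep for `limiter.reserve(Date.now())` milliseconds (for example via `setTimeout` wrapped in a promise) before making its request.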

"Rolodex" provider

Add the rolodex provider. It should provide the most relevant users based on the profile of the current user.

MongoDB Adapter

Right now we are tied to Supabase. We should also offer a Mongo option.

  • Add MongoDB adapter
  • Add docs for quickstart locally
  • Integrate and test with vector search
