Code Monkey home page Code Monkey logo

Comments (13)

keithclift24 avatar keithclift24 commented on September 13, 2024

I tested this idea's effectiveness, manually, with a series of various style questions, and the final consolidated response was better than any of the individual model's response in all the tests. I suppose the model chosen to grade could have a bias toward its own model's response (if it was the model for one of responses). To try to combat that, I created a new thread each time for the evaluation, grading, consolidation process, and didn't state where the answers came from.

from big-agi.

keithclift24 avatar keithclift24 commented on September 13, 2024

I acknowledge this might get expensive for individuals and be slower, so not practical to do all the time

from big-agi.

enricoros avatar enricoros commented on September 13, 2024

@keithclift24 - are you able to build the main branch? I've been developing this feature for 2 weeks, and it's HUGE and well done. I have the "exploration" of the space by using multi-models, but not the consolidation yet.

Please let me know quickly if you are able to build the main branch, as I need to tell you how to enable it to test, it's under wraps.

from big-agi.

enricoros avatar enricoros commented on September 13, 2024

@keithclift24 I need help with the "response consolidation logic". Please let me know how you performed what I call the "merge" phase. I have a few ideas: replace system prompt, add all 1..N messages as Assistant messages and then have the user instructions, or have them all into 1 user message sandwiched between instructions, but have not selected the method(s) to do it.

I'll have the UI with max flexibility, even for custom, but I need a proven way of doing it. My investigation into Llamaindex and Langchain yielded nothing.

Please also reach out on Discord (you'll find me on the big-AGI server).

from big-agi.

keithclift24 avatar keithclift24 commented on September 13, 2024

@keithclift24 - are you able to build the main branch? I've been developing this feature for 2 weeks, and it's HUGE and well done. I have the "exploration" of the space by using multi-models, but not the consolidation yet.

Please let me know quickly if you are able to build the main branch, as I need to tell you how to enable it to test, it's under wraps.

I really am a novice at coding, and never really worked on a serious project (besides playing around, learning), so I am afraid I'll be more trouble than help. I generated the to-do list above based off my "idea" and with GPT-4 helping me describe my suggested idea in terms you all may understand (so the requirements to do list could be jibberish for all I know, but looked relatively logical). Sorry, I'd love to help, but it's way over my head unfortunately.

from big-agi.

keithclift24 avatar keithclift24 commented on September 13, 2024

@keithclift24 I need help with the "response consolidation logic". Please let me know how you performed what I call the "merge" phase. I have a few ideas: replace system prompt, add all 1..N messages as Assistant messages and then have the user instructions, or have them all into 1 user message sandwiched between instructions, but have not selected the method(s) to do it.

I'll have the UI with max flexibility, even for custom, but I need a proven way of doing it. My investigation into Llamaindex and Langchain yielded nothing.

Please also reach out on Discord (you'll find me on the big-AGI server).

When I was talking about testing it manually above, I literally just asked a question of each GPT-4 Turbo, Claude 3 Opus, and Gemini Advanced, then copied the responses into a single .txt file on my PC and called them "Answer 1...Answer 2..Answer 3...". Then in a new thread asked GPT-4 Turbo "To the question "[Question i asked the 3 models]", I want you to objectively grade the attached 3 answers (you decide various logical criteria, but use a 1-100 scale)" and attached the .txt file saved on my PC. Then I asked to consolidate the 3 answers into one response with the best of each of the 3 answers and focus on further improving the final response based on resolving the "weaknesses" identified in the grading. (then for my own testing I asked to grade that result against the original 3 answers).

So when it comes to the most efficient way to have the big-agi.com code base accomplish this whole process (let alone the grading/merge), I don't have any advice, unfortunately.

from big-agi.

enricoros avatar enricoros commented on September 13, 2024

This is still insightful. And be ready to try it out very very soon, it's a huge feature, the UX is great, and days to be done.
Also please let me know if there's anything like this out there!

from big-agi.

keithclift24 avatar keithclift24 commented on September 13, 2024

This is still insightful. And be ready to try it out very very soon, it's a huge feature, the UX is great, and days to be done.
Also please let me know if there's anything like this out there!

Most definitely, I will! Love the project and look forward to tracking and maybe helping some.

After using the website off and on over that last 6 months a few other things I would find most useful (I'm sure you've heard these):

  1. Ability to login and keep your chat history, keys, settings and other data stored between devices.
  2. Ability to save more than one custom prompt/persona. As a janky but creative workaround, I have a chat folder on the site for prompts where I start a chat with a custom persona (prompt) as a template. Then branch from those to create new chats leaving the "templates" unchanged for future use.

from big-agi.

enricoros avatar enricoros commented on September 13, 2024
  • Ability to login and keep your chat history, keys, settings and other data stored between devices.
  • Ability to save more than one custom prompt/persona. As a janky but creative workaround, I have a chat folder on the site for prompts where I start a chat with a custom persona (prompt) as a template. Then branch from those to create new chats leaving the "templates" unchanged for future use.

For both we have tickets already.

from big-agi.

enricoros avatar enricoros commented on September 13, 2024

Hi @keithclift24 , are you on discord? This feature is released today to the first testers and I'd love you to take a look at this and give your opinion

from big-agi.

keithclift24 avatar keithclift24 commented on September 13, 2024

Hi @keithclift24 , are you on discord? This feature is released today to the first testers and I'd love you to take a look at this and give your opinion

Yes, username "kmc24", and I'm a member of the big-agi server

from big-agi.

keithclift24 avatar keithclift24 commented on September 13, 2024

@enricoros I think you're good to close but please merge with the "Chat: Best-Of N effect #381" and "BEAM - feature thread #470".

from big-agi.

enricoros avatar enricoros commented on September 13, 2024

@keithclift24 your suggestion here was seminal and great - and it's amazing we shipped fast. Your advice (and the community's) came just at the right time to be able to shape this. We can now think of what would make V2 great:

  • improved prompting for generalization and better working with small models
  • different encoding (model-dependent) of the history for the beam and merge phases
  • option to ignore the history when beaming or merging
  • ... ?

from big-agi.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.