Best of the Best: Integrating multi-model responses with Peer Review to generate a con

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

<a class="user-mention notranslate" data-hovercard-type="user" data-hover

Hi <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="

[Suggestion] Leading Models Answer Consolidator! about big-agi HOT 13 CLOSED

keithclift24 commented on September 13, 2024 1

[Suggestion] Leading Models Answer Consolidator!

from big-agi.

Comments (13)

keithclift24 commented on September 13, 2024

I tested this idea's effectiveness, manually, with a series of various style questions, and the final consolidated response was better than any of the individual model's response in all the tests. I suppose the model chosen to grade could have a bias toward its own model's response (if it was the model for one of responses). To try to combat that, I created a new thread each time for the evaluation, grading, consolidation process, and didn't state where the answers came from.

from big-agi.

keithclift24 commented on September 13, 2024

I acknowledge this might get expensive for individuals and be slower, so not practical to do all the time

from big-agi.

enricoros commented on September 13, 2024

@keithclift24 - are you able to build the main branch? I've been developing this feature for 2 weeks, and it's HUGE and well done. I have the "exploration" of the space by using multi-models, but not the consolidation yet.

Please let me know quickly if you are able to build the main branch, as I need to tell you how to enable it to test, it's under wraps.

from big-agi.

enricoros commented on September 13, 2024

@keithclift24 I need help with the "response consolidation logic". Please let me know how you performed what I call the "merge" phase. I have a few ideas: replace system prompt, add all 1..N messages as Assistant messages and then have the user instructions, or have them all into 1 user message sandwiched between instructions, but have not selected the method(s) to do it.

I'll have the UI with max flexibility, even for custom, but I need a proven way of doing it. My investigation into Llamaindex and Langchain yielded nothing.

Please also reach out on Discord (you'll find me on the big-AGI server).

from big-agi.

keithclift24 commented on September 13, 2024

@keithclift24 - are you able to build the main branch? I've been developing this feature for 2 weeks, and it's HUGE and well done. I have the "exploration" of the space by using multi-models, but not the consolidation yet.

Please let me know quickly if you are able to build the main branch, as I need to tell you how to enable it to test, it's under wraps.

I really am a novice at coding, and never really worked on a serious project (besides playing around, learning), so I am afraid I'll be more trouble than help. I generated the to-do list above based off my "idea" and with GPT-4 helping me describe my suggested idea in terms you all may understand (so the requirements to do list could be jibberish for all I know, but looked relatively logical). Sorry, I'd love to help, but it's way over my head unfortunately.

from big-agi.

keithclift24 commented on September 13, 2024

@keithclift24 I need help with the "response consolidation logic". Please let me know how you performed what I call the "merge" phase. I have a few ideas: replace system prompt, add all 1..N messages as Assistant messages and then have the user instructions, or have them all into 1 user message sandwiched between instructions, but have not selected the method(s) to do it.

I'll have the UI with max flexibility, even for custom, but I need a proven way of doing it. My investigation into Llamaindex and Langchain yielded nothing.

Please also reach out on Discord (you'll find me on the big-AGI server).

When I was talking about testing it manually above, I literally just asked a question of each GPT-4 Turbo, Claude 3 Opus, and Gemini Advanced, then copied the responses into a single .txt file on my PC and called them "Answer 1...Answer 2..Answer 3...". Then in a new thread asked GPT-4 Turbo "To the question "[Question i asked the 3 models]", I want you to objectively grade the attached 3 answers (you decide various logical criteria, but use a 1-100 scale)" and attached the .txt file saved on my PC. Then I asked to consolidate the 3 answers into one response with the best of each of the 3 answers and focus on further improving the final response based on resolving the "weaknesses" identified in the grading. (then for my own testing I asked to grade that result against the original 3 answers).

So when it comes to the most efficient way to have the big-agi.com code base accomplish this whole process (let alone the grading/merge), I don't have any advice, unfortunately.

from big-agi.

enricoros commented on September 13, 2024

This is still insightful. And be ready to try it out very very soon, it's a huge feature, the UX is great, and days to be done.
Also please let me know if there's anything like this out there!

from big-agi.

keithclift24 commented on September 13, 2024

This is still insightful. And be ready to try it out very very soon, it's a huge feature, the UX is great, and days to be done.
Also please let me know if there's anything like this out there!

Most definitely, I will! Love the project and look forward to tracking and maybe helping some.

After using the website off and on over that last 6 months a few other things I would find most useful (I'm sure you've heard these):

Ability to login and keep your chat history, keys, settings and other data stored between devices.
Ability to save more than one custom prompt/persona. As a janky but creative workaround, I have a chat folder on the site for prompts where I start a chat with a custom persona (prompt) as a template. Then branch from those to create new chats leaving the "templates" unchanged for future use.

from big-agi.

enricoros commented on September 13, 2024

Ability to login and keep your chat history, keys, settings and other data stored between devices.

Ability to save more than one custom prompt/persona. As a janky but creative workaround, I have a chat folder on the site for prompts where I start a chat with a custom persona (prompt) as a template. Then branch from those to create new chats leaving the "templates" unchanged for future use.

For both we have tickets already.

from big-agi.

enricoros commented on September 13, 2024

Hi @keithclift24 , are you on discord? This feature is released today to the first testers and I'd love you to take a look at this and give your opinion

from big-agi.

keithclift24 commented on September 13, 2024

Hi @keithclift24 , are you on discord? This feature is released today to the first testers and I'd love you to take a look at this and give your opinion

Yes, username "kmc24", and I'm a member of the big-agi server

from big-agi.

keithclift24 commented on September 13, 2024

@enricoros I think you're good to close but please merge with the "Chat: Best-Of N effect #381" and "BEAM - feature thread #470".

from big-agi.

enricoros commented on September 13, 2024

@keithclift24 your suggestion here was seminal and great - and it's amazing we shipped fast. Your advice (and the community's) came just at the right time to be able to shape this. We can now think of what would make V2 great:

improved prompting for generalization and better working with small models
different encoding (model-dependent) of the history for the beam and merge phases
option to ignore the history when beaming or merging
... ?

from big-agi.

[Suggestion] Leading Models Answer Consolidator! about big-agi HOT 13 CLOSED

Comments (13)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent