Comments (13)
I tested this idea's effectiveness, manually, with a series of various style questions, and the final consolidated response was better than any of the individual model's response in all the tests. I suppose the model chosen to grade could have a bias toward its own model's response (if it was the model for one of responses). To try to combat that, I created a new thread each time for the evaluation, grading, consolidation process, and didn't state where the answers came from.
from big-agi.
I acknowledge this might get expensive for individuals and be slower, so not practical to do all the time
from big-agi.
@keithclift24 - are you able to build the main
branch? I've been developing this feature for 2 weeks, and it's HUGE and well done. I have the "exploration" of the space by using multi-models, but not the consolidation yet.
Please let me know quickly if you are able to build the main
branch, as I need to tell you how to enable it to test, it's under wraps.
from big-agi.
@keithclift24 I need help with the "response consolidation logic". Please let me know how you performed what I call the "merge" phase. I have a few ideas: replace system prompt, add all 1..N messages as Assistant messages and then have the user instructions, or have them all into 1 user message sandwiched between instructions, but have not selected the method(s) to do it.
I'll have the UI with max flexibility, even for custom, but I need a proven way of doing it. My investigation into Llamaindex and Langchain yielded nothing.
Please also reach out on Discord (you'll find me on the big-AGI server).
from big-agi.
@keithclift24 - are you able to build the
main
branch? I've been developing this feature for 2 weeks, and it's HUGE and well done. I have the "exploration" of the space by using multi-models, but not the consolidation yet.Please let me know quickly if you are able to build the
main
branch, as I need to tell you how to enable it to test, it's under wraps.
I really am a novice at coding, and never really worked on a serious project (besides playing around, learning), so I am afraid I'll be more trouble than help. I generated the to-do list above based off my "idea" and with GPT-4 helping me describe my suggested idea in terms you all may understand (so the requirements to do list could be jibberish for all I know, but looked relatively logical). Sorry, I'd love to help, but it's way over my head unfortunately.
from big-agi.
@keithclift24 I need help with the "response consolidation logic". Please let me know how you performed what I call the "merge" phase. I have a few ideas: replace system prompt, add all 1..N messages as Assistant messages and then have the user instructions, or have them all into 1 user message sandwiched between instructions, but have not selected the method(s) to do it.
I'll have the UI with max flexibility, even for custom, but I need a proven way of doing it. My investigation into Llamaindex and Langchain yielded nothing.
Please also reach out on Discord (you'll find me on the big-AGI server).
When I was talking about testing it manually above, I literally just asked a question of each GPT-4 Turbo, Claude 3 Opus, and Gemini Advanced, then copied the responses into a single .txt file on my PC and called them "Answer 1...Answer 2..Answer 3...". Then in a new thread asked GPT-4 Turbo "To the question "[Question i asked the 3 models]", I want you to objectively grade the attached 3 answers (you decide various logical criteria, but use a 1-100 scale)" and attached the .txt file saved on my PC. Then I asked to consolidate the 3 answers into one response with the best of each of the 3 answers and focus on further improving the final response based on resolving the "weaknesses" identified in the grading. (then for my own testing I asked to grade that result against the original 3 answers).
So when it comes to the most efficient way to have the big-agi.com code base accomplish this whole process (let alone the grading/merge), I don't have any advice, unfortunately.
from big-agi.
This is still insightful. And be ready to try it out very very soon, it's a huge feature, the UX is great, and days to be done.
Also please let me know if there's anything like this out there!
from big-agi.
This is still insightful. And be ready to try it out very very soon, it's a huge feature, the UX is great, and days to be done.
Also please let me know if there's anything like this out there!
Most definitely, I will! Love the project and look forward to tracking and maybe helping some.
After using the website off and on over that last 6 months a few other things I would find most useful (I'm sure you've heard these):
- Ability to login and keep your chat history, keys, settings and other data stored between devices.
- Ability to save more than one custom prompt/persona. As a janky but creative workaround, I have a chat folder on the site for prompts where I start a chat with a custom persona (prompt) as a template. Then branch from those to create new chats leaving the "templates" unchanged for future use.
from big-agi.
- Ability to login and keep your chat history, keys, settings and other data stored between devices.
- Ability to save more than one custom prompt/persona. As a janky but creative workaround, I have a chat folder on the site for prompts where I start a chat with a custom persona (prompt) as a template. Then branch from those to create new chats leaving the "templates" unchanged for future use.
For both we have tickets already.
from big-agi.
Hi @keithclift24 , are you on discord? This feature is released today to the first testers and I'd love you to take a look at this and give your opinion
from big-agi.
Hi @keithclift24 , are you on discord? This feature is released today to the first testers and I'd love you to take a look at this and give your opinion
Yes, username "kmc24", and I'm a member of the big-agi server
from big-agi.
@enricoros I think you're good to close but please merge with the "Chat: Best-Of N effect #381" and "BEAM - feature thread #470".
from big-agi.
@keithclift24 your suggestion here was seminal and great - and it's amazing we shipped fast. Your advice (and the community's) came just at the right time to be able to shape this. We can now think of what would make V2 great:
- improved prompting for generalization and better working with small models
- different encoding (model-dependent) of the history for the beam and merge phases
- option to ignore the history when beaming or merging
- ... ?
from big-agi.
Related Issues (20)
- [Roadmap] Autosaving all chats to a local file
- [BUG] Fail to pickup available model automatically for Azure OpenAI when GPT3.5 isn't avail. HOT 2
- Fetch times out on low end devices serving OLLAMA HOT 4
- [Roadmap] Continuation feature that benefits coders HOT 5
- [Roadmap] Support for DuckDuckGo search HOT 1
- OpenPipe support HOT 5
- Unified Framework for Multi-Providers support
- YouTube Transcriber: Error fetching transcript. Please try again. HOT 7
- [BUG] Small bug when using big-AGI 2 HOT 6
- [Roadmap] langfuse integration for prompt tracking / metrics
- Global Variables Definition and Documentation
- Per-Beam Local Context Variables in Beam
- [Roadmap] Support Anthropic's prompt caching feature HOT 8
- [Roadmap] Support ElevenLabs Custom API Endpoint
- [Roadmap] Integrate LLM output that contains code with third-party tools like repl.it HOT 2
- [BUG] Redirection to config-local-ollama.md is not working in environment-variables.md
- [BUG] Unable to Input Image to Any Model
- [BUG]
- Support Anthropic Claude model from GCP
- [BUG] Custom Personas Not Persisting After Reload HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from big-agi.