Comments (2)
Currently, all documents uploaded into the rabbit hole are utilized to generate a user's response. However, in certain situations, there's a need to prioritize specific documents, especially when working with a large number of documents. This is where metadata can play a crucial role in facilitating this process.
I propose two necessary actions:
- Introduce the capability to set metadata during the document upload API process;
This can be done via the before_rabbithole_stores_documents or before_rabbithole_insert_memory hooks, but not yet directly via the API. Have a look here:
Line 312 in c79272b
The difference is that in the first hook you have access to all the chunks and can change them (including their metadata), summarize them, or delete them.
In the second you deal with a single chunk.
In the chunk metadata you should use the same key you will use on the recall side (see below).
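A minimal sketch of the two upload-side hooks described above, written as a plugin file. The key name "category", the value "contract", and the helper tag_chunk are hypothetical and only serve as illustration; the fallback decorator is there so the file can also be run outside the Cat.

```python
try:
    # Available when this file lives inside a Cheshire Cat plugin
    from cat.mad_hatter.decorators import hook
except ImportError:
    def hook(func):
        """No-op stand-in so the sketch can be run outside the Cat."""
        return func

# Hypothetical metadata key; use the same key on the recall side
CATEGORY_KEY = "category"


def tag_chunk(doc, category):
    """Attach a category to one chunk, keeping its existing metadata."""
    doc.metadata[CATEGORY_KEY] = category
    return doc


@hook
def before_rabbithole_insert_memory(doc, cat):
    # Called once per chunk, just before it is stored
    return tag_chunk(doc, "contract")  # hypothetical value


@hook
def before_rabbithole_stores_documents(docs, cat):
    # Called once with the full list of chunks; they can be edited,
    # summarized, or dropped here
    return [tag_chunk(d, "contract") for d in docs]
```

In practice only one of the two hooks would be needed; both are shown to make the single-chunk versus all-chunks difference concrete.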
- Introduce the capability to apply filters when a user sends a message.
You can do it via a hook, using before_cat_recalls_<collection_name>_memories.
You can see how it is used in the Cat Advanced Tools plugin.
core/core/cat/looking_glass/stray_cat.py
Line 220 in c79272b
In the hook you should be able to add "metadata": {....} to the recall config, and that should filter the retrieval, because it ends up here:
If necessary, the metadata you pass can be obtained via websocket, accessible via cat.user_message_json.xxx, or as the output of an entity extraction chain (which you can also place in the hook).
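As a rough sketch of the recall-side filter, assuming the declarative collection and assuming the client sends an extra "category" field in its websocket message. The key name and the helper add_metadata_filter are hypothetical; the fallback decorator only exists so the file can be run outside the Cat.

```python
try:
    # Available when this file lives inside a Cheshire Cat plugin
    from cat.mad_hatter.decorators import hook
except ImportError:
    def hook(func):
        """No-op stand-in so the sketch can be run outside the Cat."""
        return func

# Hypothetical metadata key; must match the key set on the chunks at upload
CATEGORY_KEY = "category"


def add_metadata_filter(recall_config, category):
    """Add a metadata filter to the recall config when a category is given."""
    if category:
        recall_config["metadata"] = {CATEGORY_KEY: category}
    return recall_config


@hook
def before_cat_recalls_declarative_memories(declarative_recall_config, cat):
    # Assumption: the client includes a "category" field in the message payload
    category = getattr(cat.user_message_json, CATEGORY_KEY, None)
    return add_metadata_filter(declarative_recall_config, category)
```

When no category arrives with the message, the config is returned unchanged, so retrieval falls back to searching all documents.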
Let me know if all clear!
Thanks
Everything is clear, @pieroit, thank you. However, using a plugin to set metadata implies that I won't be able to customize values depending on the file I'm uploading. Correct?
For the project I'm working on, we're uploading various documents categorized by type. If I create a plugin to set the documents' metadata, I would need to change the plugin's settings with every upload. On the other hand, if I could set the metadata directly via API, the process would be much faster.
Regarding filtering, I believe I've got it. I had modified the core of Cheshire Cat to automatically handle filters based on user messages, but perhaps this can also be achieved through a plugin. I'll give it a try. Thank you!