[Feature] Using metadata to filter declarative memory enhances response accuracy about core HOT 2 OPEN

Fede91 commented on June 9, 2024 1

[Feature] Using metadata to filter declarative memory enhances response accuracy

Comments (2)

pieroit commented on June 9, 2024 1

Currently, all documents uploaded into the rabbit hole are utilized to generate a user's response. However, in certain situations, there's a need to prioritize specific documents, especially when working with a large number of documents. This is where metadata can play a crucial role in facilitating this process.

I propose two necessary actions:

Introduce the capability to set metadata during the document upload API process;

This can be done via the before_rabbithole_stores_documents or before_rabbithole_insert_memory hooks, but not yet directly via the API. Have a look here:

core/core/cat/rabbit_hole.py

Line 312 in c79272b

def store_documents(self, stray, docs: List[Document], source: str) -> None:

The difference is that in the first hook you access all the chunks and can change them (including their metadata), summarize them, or delete them.
In the second you deal with a single chunk.
In the chunk metadata you should use the same key that you will use on the recall side (see below).
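
As a rough sketch of the upload-side approach described above, a plugin could tag every chunk before it is stored. The hook name comes from this thread; the `(docs, cat)` signature, the `cat.mad_hatter.decorators` import path, and the `doc_type` key with its value are assumptions made for illustration. The fallback decorator and the `SimpleNamespace` chunk exist only so the snippet also runs outside a Cheshire Cat installation.

```python
from types import SimpleNamespace

try:
    # Import path assumed from the usual Cheshire Cat plugin layout.
    from cat.mad_hatter.decorators import hook
except ImportError:
    # Stand-in decorator so the sketch can run outside the Cat.
    def hook(func):
        return func

# Hypothetical key and value; reuse the same key on the recall side.
EXTRA_METADATA = {"doc_type": "invoice"}


def tag_chunk(doc, extra):
    """Merge `extra` into one chunk's metadata and return the chunk."""
    doc.metadata = {**(doc.metadata or {}), **extra}
    return doc


def tag_chunks(docs, extra):
    """Apply `tag_chunk` to every chunk of an uploaded document."""
    return [tag_chunk(doc, extra) for doc in docs]


@hook
def before_rabbithole_stores_documents(docs, cat):
    # All chunks are visible here, so they can be edited, summarized, or dropped.
    # The single-chunk hook (before_rabbithole_insert_memory) could instead
    # call tag_chunk on the one chunk it receives.
    return tag_chunks(docs, EXTRA_METADATA)


# Stand-in chunk with the two attributes the helpers rely on.
chunks = [SimpleNamespace(page_content="some text", metadata={"source": "a.pdf"})]
tagged = tag_chunks(chunks, EXTRA_METADATA)
```

Keeping the tagging logic in plain helpers makes it easy to check without running the Cat; the hook itself only forwards to them.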

Introduce the capability to apply filters when a user sends a message.

You can do it via a hook, using before_cat_recalls_<collection_name>_memories.
You can see how it is used in the Cat Advanced Tools plugin.

core/core/cat/looking_glass/stray_cat.py

Line 220 in c79272b

# hooks to change recall configs for each memory

In the hook you should be able to add "metadata": {....} to the recall config, and that metadata should then filter the retrieval, because it ends up here:

core/core/cat/memory/vector_memory_collection.py

Line 235 in c79272b

query_filter=self._qdrant_filter_from_dict(metadata),

If necessary, the metadata you pass can be obtained via websocket, accessible via cat.user_message_json.xxx, or as the output of an entity extraction chain (which you can also place in the hook).
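
A minimal sketch of the recall-side hook, assuming the declarative collection and a recall config that is a plain dict with a "metadata" entry. The hook name follows the before_cat_recalls_<collection_name>_memories pattern mentioned above; the `(declarative_recall_config, cat)` signature, the import path, and the `doc_type` field sent by the client are assumptions, not confirmed API.

```python
try:
    # Import path assumed from the usual Cheshire Cat plugin layout.
    from cat.mad_hatter.decorators import hook
except ImportError:
    # Stand-in decorator so the sketch can run outside the Cat.
    def hook(func):
        return func


def with_metadata_filter(recall_config, metadata):
    """Return a copy of the recall config that also filters on `metadata`."""
    merged = {**(recall_config.get("metadata") or {}), **metadata}
    return {**recall_config, "metadata": merged}


@hook
def before_cat_recalls_declarative_memories(declarative_recall_config, cat):
    # `doc_type` is a hypothetical extra field sent along with the user message.
    doc_type = getattr(cat.user_message_json, "doc_type", None)
    if doc_type is None:
        # No filter requested: recall from all documents as before.
        return declarative_recall_config
    return with_metadata_filter(declarative_recall_config, {"doc_type": doc_type})


# Example config shape (assumed) and the result of adding a filter to it.
config = {"embedding": [0.1, 0.2], "k": 3, "threshold": 0.7, "metadata": None}
filtered = with_metadata_filter(config, {"doc_type": "invoice"})
```

The helper returns a new dict rather than mutating its input, so the default config stays untouched when no filter applies.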

Let me know if all clear!
Thanks

Fede91 commented on June 9, 2024

Everything is clear, @pieroit, thank you. However, using a plugin to set metadata implies that I won't be able to customize the values depending on the file I'm uploading. Correct?

For the project I'm working on, we're uploading various documents categorized by type. If I create a plugin to set the documents' metadata, I would need to change the plugin's settings with every upload. On the other hand, if I could set the metadata directly via API, the process would be much faster.

Regarding filtering, I believe I've got it. I had modified the core of Cheshire Cat to automatically handle filters based on user messages, but perhaps this can also be achieved through a plugin. I'll give it a try. Thank you!
