Use more deeply langchain routines to keep the prompt at limited length (CombineDcoume

prompts get too large about core HOT 5 CLOSED

cheshire-cat-ai commented on May 18, 2024

prompts get too large

from core.

Comments (5)

pieroit commented on May 18, 2024

@calebgcc I added a customizable TextSplitter to the rabbit_hole (chunk_size and chunk_overlap), so Cat users can decide for themselves how long their text chunks should be.
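To make the effect of the two parameters concrete, here is a minimal sketch of overlapping character-based chunking. It is not the rabbit_hole code or langchain's splitter: `split_text` and its default values are hypothetical, and a real splitter usually also tries to break on separators such as newlines.

```python
from __future__ import annotations


def split_text(text: str, chunk_size: int = 400, chunk_overlap: int = 100) -> list[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    Hypothetical helper for illustration only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in split_text("abcdefghij", chunk_size=4, chunk_overlap=2):
        print(chunk)
```

A larger chunk_size means fewer but longer chunks, so each recalled memory takes more room in the prompt.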

In docs here you will find a list of langchain Documents (each is just an object with text and metadata) to experiment with file summarization.

pieroit commented on May 18, 2024

Increasing k is a good test. Also, if somebody uploads a doc and chooses a large chunk size, the problem remains.

There should be a check before inserting memories into the prompt: if they are "too long", they should be summarized.
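A rough sketch of such a check could look like the following. Everything here is an assumption made for illustration: `fit_memories` is a hypothetical name, character count stands in for token count, and `summarize` is any callable that shortens a string (in practice it would wrap an LLM call).

```python
from typing import Callable, List


def fit_memories(
    memories: List[str],
    max_chars: int,
    summarize: Callable[[str], str],
) -> List[str]:
    """Return memories as-is if they fit the budget, else one summarized memory.

    Hypothetical helper: max_chars is a crude proxy for a token budget.
    """
    total = sum(len(memory) for memory in memories)
    if total <= max_chars:
        return memories
    # Over budget: collapse everything into a single summary.
    return [summarize("\n".join(memories))]


if __name__ == "__main__":
    # Placeholder summarizer that just truncates; a real one would call the LLM.
    def truncate(text: str) -> str:
        return text[:60]

    recalled = ["a" * 50, "b" * 50]
    print(len(fit_memories(recalled, max_chars=200, summarize=truncate)))
    print(len(fit_memories(recalled, max_chars=60, summarize=truncate)))
```

Checking the combined length rather than each memory separately also covers the case where k is increased and many small chunks add up.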

We can postpone the problem and close this issue, since we are mostly covered, or, if you feel like it, you can also tackle the above.

Thanks 🙏

calebgcc commented on May 18, 2024

I closed issue #49 since this one is more specific; we can use this issue to discuss how to implement summarization further 🙌.

About your comment in PR #52:

I can test other chain_type values to see whether I get the same problem with large files.

I'm going to dig a little deeper into the docs that you left (about llama-index) to understand better how to implement the custom summary chain, but if I understand correctly the basic idea is:

1. take a list of strings as input
2. group them into separate docs
3. get a summary of each doc (the summaries become the new input)
4. repeat until we have a single short summary
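The loop described above can be sketched as follows. This is only an illustration of the idea, not the custom summary chain itself: `summarize_recursively` and `group_size` are hypothetical names, and `summarize` stands in for whatever LLM-backed function produces a summary of one grouped doc.

```python
from typing import Callable, List


def summarize_recursively(
    texts: List[str],
    summarize: Callable[[str], str],
    group_size: int = 3,
) -> str:
    """Group texts, summarize each group, and repeat until one summary remains.

    Hypothetical sketch: summarize is a placeholder for an LLM call.
    """
    if not texts:
        return ""
    if group_size < 2:
        raise ValueError("group_size must be at least 2 so each pass shrinks the list")

    current = list(texts)
    while True:
        # Group neighbouring texts into docs of up to group_size items.
        groups = [current[i:i + group_size] for i in range(0, len(current), group_size)]
        # Each group's summary becomes an input for the next pass.
        current = [summarize("\n".join(group)) for group in groups]
        if len(current) == 1:
            return current[0]


if __name__ == "__main__":
    # Placeholder summarizer: keep only the first 40 characters.
    def shorten(text: str) -> str:
        return text[:40]

    pages = [f"page {i}: some long content" for i in range(7)]
    print(summarize_recursively(pages, shorten, group_size=3))
```

With 7 inputs and a group size of 3, the first pass produces 3 summaries and the second pass merges them into one, so the summarizer is called 4 times in total.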

pieroit commented on May 18, 2024

PR #68 is merged, and file uploads are now summarized.
The next step is to summarize when the list of memories recalled here makes the prompt too large.
Leaving this issue open.

calebgcc commented on May 18, 2024

@pieroit I was trying to trigger this error, but I think summarizing and chunking the documents solved it.

The documents retrieved from the Cat are usually too small to cause problems, and on top of that k defaults to 5.

Maybe prompt summarization is no longer necessary. Let me know how to proceed; for example, we could try increasing the value of k to see how it affects the prompt.
