mobarski / ask-my-pdf
Question answering system for PDF files
License: MIT License
I compared these two resources using the same PDF. chatpdf gives much, much more precise answers. Why is that?
InvalidRequestError: This model's maximum context length is 8191 tokens, however you requested 13831 tokens (13831 in your prompt; 0 for the completion). Please reduce your prompt; or completion length.
Traceback:
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 561, in _run_script
self._session_state.on_script_will_rerun(rerun_data.widget_states)
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/safe_session_state.py", line 68, in on_script_will_rerun
self._state.on_script_will_rerun(latest_widget_states)
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 474, in on_script_will_rerun
self._call_callbacks()
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 487, in _call_callbacks
self._new_widget_state.call_callback(wid)
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 242, in call_callback
callback(*args, **kwargs)
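The error above says a single request carried 13831 tokens against a limit of 8191. One common way around this is to split the text into pieces that each fit the budget before sending them. The following is a minimal sketch, not the project's own code: `split_by_token_budget` is a hypothetical helper, and the default `count_tokens` (a plain whitespace word count) is only a stand-in for a real tokenizer, which would have to be passed in to get accurate counts.

```python
from typing import Callable, List


def split_by_token_budget(
    text: str,
    max_tokens: int = 8191,
    count_tokens: Callable[[str], int] = lambda s: len(s.split()),
) -> List[str]:
    """Greedily pack whitespace-separated words into chunks of at most max_tokens.

    count_tokens is an assumption: swap in a real tokenizer-based counter,
    since a word count usually underestimates the model's token count.
    """
    chunks: List[str] = []
    current: List[str] = []
    for word in text.split():
        candidate = " ".join(current + [word])
        if current and count_tokens(candidate) > max_tokens:
            # Adding this word would exceed the budget: close the chunk.
            chunks.append(" ".join(current))
            current = [word]
        else:
            current.append(word)
    if current:
        chunks.append(" ".join(current))
    return chunks


if __name__ == "__main__":
    print(split_by_token_budget("a b c d e", max_tokens=2))
```

A single word that is larger than the budget on its own still ends up as its own chunk, so a real implementation would need to handle that case separately.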
Should I add it at the root?
Also, what is this setting for, please?
Thank you so much.
Redis configuration (for persistent usage statistics / user feedback):
DeprecationError: PdfFileReader is deprecated and was removed in PyPDF2 3.0.0. Use PdfReader instead.
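For reference, the PyPDF2 3.x replacement for the removed class looks roughly like the sketch below. It assumes PyPDF2 3.x is installed; `pages_to_text` and `read_pdf_text` are hypothetical helper names, not functions from this repository.

```python
from typing import Iterable, List


def pages_to_text(pages: Iterable) -> List[str]:
    """Collect text from page objects that expose extract_text() (the PyPDF2 3.x name)."""
    return [(page.extract_text() or "") for page in pages]


def read_pdf_text(path: str) -> List[str]:
    """Return one string per page of the PDF at path."""
    # Imported lazily so pages_to_text can be used without PyPDF2 installed.
    from PyPDF2 import PdfReader  # replaces the removed PdfFileReader

    reader = PdfReader(path)
    # reader.pages replaces getNumPages()/getPage(i);
    # extract_text() replaces extractText().
    return pages_to_text(reader.pages)
```

Pinning an older PyPDF2 release (below 3.0.0) is the other way to make the error go away, at the cost of staying on the deprecated API.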
Have you tried pdfplumber instead?
Question:
Is it possible to load multiple files at a time?
This way I could ask a question and it could search all the resource documents to compile an answer.
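One way this could work is to keep a separate fragment index per file and rank fragments from all of them together. The sketch below is only an illustration under assumptions: each index is taken to be a list of `(fragment_text, embedding_vector)` pairs, and `search_all` and `cosine` are hypothetical names, not part of ask-my-pdf.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_all(
    query_vec: Vector,
    indexes: Dict[str, List[Tuple[str, Vector]]],
    top_k: int = 3,
) -> List[Tuple[float, str, str]]:
    """Rank fragments from every file together; returns (score, filename, text)."""
    scored = []
    for filename, fragments in indexes.items():
        for text, vec in fragments:
            scored.append((cosine(query_vec, vec), filename, text))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]
```

Keeping the filename next to each hit lets the answer cite which document a fragment came from.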
Is there any way to stream text by segmenting the fragments from model.query? The loading time to render the entire text block for larger PDFs is a bit too long.
Would it be possible to recover stored vector indexes across sessions (same API key), at least within 90 days?
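A rough sketch of what such persistence could look like: store each index on disk under a folder derived from a hash of the API key, and ignore entries older than 90 days. Everything here is an assumption for illustration (the `index_store` folder, the JSON layout, and the helper names); it is not how the app currently stores data.

```python
import hashlib
import json
import time
from pathlib import Path
from typing import Optional


def index_path(api_key: str, doc_name: str, root: str) -> Path:
    # Hash the key so the raw API key never appears on disk.
    user = hashlib.sha256(api_key.encode("utf-8")).hexdigest()[:16]
    return Path(root) / user / f"{doc_name}.json"


def save_index(api_key: str, doc_name: str, index: dict, root: str = "index_store") -> None:
    """Write the index plus a timestamp; index must be JSON-serializable."""
    path = index_path(api_key, doc_name, root)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"saved_at": time.time(), "index": index}))


def load_index(
    api_key: str,
    doc_name: str,
    root: str = "index_store",
    max_age_days: float = 90,
) -> Optional[dict]:
    """Return the stored index, or None if it is missing or older than max_age_days."""
    path = index_path(api_key, doc_name, root)
    if not path.exists():
        return None
    record = json.loads(path.read_text())
    if time.time() - record["saved_at"] > max_age_days * 86400:
        return None
    return record["index"]
```

The sketch uses `doc_name` directly as a file name, so names containing path separators would need to be sanitized first.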
Getting the following error when launching the app
File "C:\Users\xxx\Anaconda3\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 565, in _run_script
exec(code, module.__dict__)
File "C:\Users\xxx\ask-my-pdf\src\gui.py", line 20, in <module>
import model
File "C:\Users\xxx\ask-my-pdf\src\model.py", line 12, in <module>
import ai
File "C:\Users\xxx\ask-my-pdf\src\ai.py", line 39, in <module>
tokenizer_model = openai.model('text-davinci-003')
File "C:\Users\xxx\Anaconda3\lib\site-packages\ai_bricks\api\openai.py", line 30, in model
return _class(name, **kwargs)
File "C:\Users\xxx\Anaconda3\lib\site-packages\ai_bricks\api\openai.py", line 57, in __init__
self.encoder = tiktoken.encoding_for_model(name)
I got the following error:
ClientError: An error occurred (InvalidArgument) when calling the ListObjects operation: Unknown
Traceback:
File "/usr/local/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
exec(code, module.__dict__)
File "/Users/canalescl/personal/replit/ask-my-pdf/src/gui.py", line 248, in <module>
ui_pdf_file()
File "/Users/canalescl/personal/replit/ask-my-pdf/src/gui.py", line 91, in ui_pdf_file
filenames += ss['storage'].list()
File "/Users/canalescl/personal/replit/ask-my-pdf/src/storage.py", line 46, in list
return [self.decode(name) for name in self._list()]
File "/Users/canalescl/personal/replit/ask-my-pdf/src/storage.py", line 184, in _list
resp = self.s3.list_objects(
File "/usr/local/lib/python3.10/site-packages/botocore/client.py", line 530, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/usr/local/lib/python3.10/site-packages/botocore/client.py", line 960, in _make_api_call
raise error_class(parsed_response, operation_name)
This happens in a new folder, cloned from scratch, after entering the key under "Enter your OpenAI API key" and pressing Enter.
I've set max fragments to 1 and both "before" and "after" to 0, yet the model still passes 3 fragments when obtaining the answer.
Is it necessary to use an OpenAI API key, or do you have any advice on creating the API from scratch?
Hi,
I had no problem running your demo, but something went wrong on my side when I tried to set these parameters in run.sh:
STORAGE_MODE=LOCAL and CACHE_MODE=DISK.
No data is saved under the cache/storage folder on disk.
I see the same problem with REDIS, but maybe it is linked to the issue above.
Any ideas?
Thank you
Would it be possible to read the document and then get GPT to generate x number of questions and answers based on the text?
This could be used to train an AI bot to help answer questions.
My scenario would be training an AI to explain a company-specific topic. Say you got it to read all internal documents on a certain subject, generated over many years. If it then generated thousands of questions and answers, you could use them to train a bot to help users.