supabase-community / chatgpt-your-files
Production-ready MVP for securely chatting with your documents using pgvector
Home Page: https://youtu.be/ibzlEQmgPPY
I am developing using the cloud, no docker.
When I run `npm run build`, I get this error:
Text version:
./supabase/functions/_lib/markdown-parser.ts:1:35
Type error: Cannot find module 'mdast' or its corresponding type declarations.
> 1 | import { Root, RootContent } from 'mdast';
| ^
2 | import { fromMarkdown } from 'mdast-util-from-markdown';
3 | import { toMarkdown } from 'mdast-util-to-markdown';
4 | import { toString } from 'mdast-util-to-string';
In next.config.js I tried adding `exclude: ['supabase']` to the nextConfig object, but it does not work.
I tried adding this too, but no luck:
experimental: {
  serverActions: true,
  ignore: [
    '/supabase/functions/**/*',
  ],
},
Here are both attempts together in the next.config.js file:
/** @type {import('next').NextConfig} */
const nextConfig = {
  experimental: {
    serverActions: true,
    ignore: [
      '/supabase/functions/**/*',
    ],
  },
  exclude: ['supabase/functions'],
  webpack: (config) => {
    config.resolve.alias = {
      ...config.resolve.alias,
      sharp$: false,
      'onnxruntime-node$': false,
    };
    return config;
  },
};

module.exports = nextConfig;
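One hedged alternative to deleting folders: if the failure comes from Next.js's build-time TypeScript check (neither `exclude` nor `experimental.ignore` are recognized next.config.js options), excluding the Deno functions in tsconfig.json may be enough. A sketch (merge with your existing `exclude` entries):

```json
{
  "exclude": ["node_modules", "supabase/functions"]
}
```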
As a workaround I deleted every folder inside supabase/functions except _lib, and I deleted the markdown-parser.ts file there since that's where the issue was. Then `npm run build` worked.
Also, why is mdast used? The npm page for it says it's deprecated and was actually renamed to remark: https://www.npmjs.com/package/mdast
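For what it's worth, the `mdast` import in markdown-parser.ts pulls in type declarations only; those are published separately as `@types/mdast` on DefinitelyTyped, while the deprecated `mdast` runtime package that was renamed to remark is a different thing. A hedged fix for the missing-module error is adding the types as a dev dependency (version shown is illustrative):

```json
{
  "devDependencies": {
    "@types/mdast": "^4.0.0"
  }
}
```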
Howdy, the tutorial seems to be in an in-between state, where Deno is expected in some parts but the new Edge Functions runtime is expected in others.
The upshot is that when running through the tutorial as described (i.e., checking out step-1, step-2, etc.) the embeddings step no longer works (locally at least), producing an error about a missing reference to Supabase:
ReferenceError: Supabase is not defined
at file:///home/deno/functions/embed/index.ts:4:15
I'm not 100% convinced on the root cause; I'm still working through things and will update if I get a clearer picture, or close this if I've just done something incorrectly along the way!
Run through the tutorial up to the embeddings portion.
Embeddings should be triggered on upload.
N/A
I can work around it by skipping the tutorial and running from the main branch, but that misses the muscle-memory-building, step-by-step stuff :)
Supabase not installing via npm in WSL2
Trying to install Supabase, but it fails every time.
Steps to reproduce the behavior, please provide code snippets or a repository:
1. Install WSL2
2. Run `sudo apt-get install -y nodejs`
3. Run `npm i -D [email protected]`
maryam@DESKTOP-KP4KII1:~/chatgpt-your-files$ npm i -D [email protected]
npm ERR! code 1
npm ERR! path \wsl.localhost\Ubuntu\home\maryam\chatgpt-your-files\node_modules\sharp
npm ERR! command failed
npm ERR! command C:\WINDOWS\system32\cmd.exe /d /s /c (node install/libvips && node install/dll-copy && prebuild-install) || (node install/can-compile && node-gyp rebuild && node install/dll-copy)
npm ERR! '\wsl.localhost\Ubuntu\home\maryam\chatgpt-your-files\node_modules\sharp'
npm ERR! CMD.EXE was started with the above path as the current directory.
npm ERR! UNC paths are not supported. Defaulting to Windows directory.
npm ERR! node:internal/modules/cjs/loader:1080
npm ERR! throw err;
npm ERR! ^
npm ERR!
npm ERR! Error: Cannot find module 'C:\Windows\install\libvips'
npm ERR! at Module._resolveFilename (node:internal/modules/cjs/loader:1077:15)
npm ERR! at Module._load (node:internal/modules/cjs/loader:922:27)
npm ERR! at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12)
npm ERR! at node:internal/main/run_main_module:23:47 {
npm ERR! code: 'MODULE_NOT_FOUND',
npm ERR! requireStack: []
npm ERR! }
npm ERR!
npm ERR! Node.js v18.17.1
npm ERR! node:internal/modules/cjs/loader:1080
npm ERR! throw err;
npm ERR! ^
npm ERR!
npm ERR! Error: Cannot find module 'C:\Windows\install\can-compile'
npm ERR! at Module._resolveFilename (node:internal/modules/cjs/loader:1077:15)
npm ERR! at Module._load (node:internal/modules/cjs/loader:922:27)
npm ERR! at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12)
npm ERR! at node:internal/main/run_main_module:23:47 {
npm ERR! code: 'MODULE_NOT_FOUND',
npm ERR! requireStack: []
npm ERR! }
npm ERR!
npm ERR! Node.js v18.17.1
npm ERR! A complete log of this run can be found in: C:\Users\marya\AppData\Local\npm-cache_logs\2024-03-13T10_33_11_606Z-debug-0.log
maryam@DESKTOP-KP4KII1:~/chatgpt-your-files$
Seeing a few issues like:
[Info] Generated embedding {"table":"document_sections","id":71,"contentColumn":"content","embeddingColumn":"embedding"}
CPU time limit reached. isolate: 10342867669556939419
event loop error: Uncaught Error: execution terminated
failed to send request to user worker: connection closed before message completed
failed to send request to user worker: connection closed before message completed
InvalidWorkerResponse: user worker failed to respond
at async Promise.all (index 1)
at async UserWorker.fetch (ext:sb_user_workers/user_workers.js:64:19)
at async Server.<anonymous> (file:///home/deno/main/index.ts:128:12)
at async Server.#respond (https://deno.land/[email protected]/http/server.ts:220:18) {
name: "InvalidWorkerResponse"
}
InvalidWorkerResponse: user worker failed to respond
at async Promise.all (index 1)
at async UserWorker.fetch (ext:sb_user_workers/user_workers.js:64:19)
at async Server.<anonymous> (file:///home/deno/main/index.ts:128:12)
at async Server.#respond (https://deno.land/[email protected]/http/server.ts:220:18) {
name: "InvalidWorkerResponse"
}
serving the request with /home/deno/functions/chat
serving the request with /home/deno/functions/chat
And
[Info] Saved 25 sections for file 'roman-empire-3.md'
serving the request with /home/deno/functions/embed
serving the request with /home/deno/functions/embed
serving the request with /home/deno/functions/embed
client connection error (hyper::Error(IncompleteMessage))
client connection error (hyper::Error(IncompleteMessage))
client connection error (hyper::Error(IncompleteMessage))
[Info] Generated embedding {"table":"document_sections","id":66,"contentColumn":"content","embeddingColumn":"embedding"}
And:
⨯ node_modules/@xenova/transformers/src/env.js (60:0) @ eval
⨯ TypeError: Cannot read properties of undefined (reading 'wasm')
at __webpack_require__ (/code/chatgpt-your-files/.next/server/webpack-runtime.js:33:43)
at __webpack_require__ (/code/chatgpt-your-files/.next/server/webpack-runtime.js:33:43)
at __webpack_require__ (/code/chatgpt-your-files/.next/server/webpack-runtime.js:33:43)
at __webpack_require__ (/code/chatgpt-your-files/.next/server/webpack-runtime.js:33:43)
at __webpack_require__ (/code/chatgpt-your-files/.next/server/webpack-runtime.js:33:43)
at eval (./lib/hooks/use-pipeline.ts:5:78)
at (ssr)/./lib/hooks/use-pipeline.ts (/code/chatgpt-your-files/.next/server/app/chat/page.js:423:1)
at __webpack_require__ (/code/chatgpt-your-files/.next/server/webpack-runtime.js:33:43)
at eval (./app/chat/page.tsx:10:81)
at (ssr)/./app/chat/page.tsx (/code/chatgpt-your-files/.next/server/app/chat/page.js:357:1)
at __webpack_require__ (/code/chatgpt-your-files/.next/server/webpack-runtime.js:33:43)
at JSON.parse (<anonymous>)
null
Wondering what that could be about. Thanks for the great workshop!
Let's say there is an issue with chunking and only half the document got chunked before a timeout. There is no easy way to inform the user to try again (i.e., delete the uploaded file and retry it).
Supabase should provide a way to surface the error to the user, indicating that something failed while running the background task, so they can try it again.
An alternative is to have a table to track individual states and alert the user to retry if a state has not been reached after some time.
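A minimal sketch of the suggested state-tracking table (all names here are hypothetical, not from the project):

```sql
-- Hypothetical tracking table: one row per section, updated by the pipeline
create table document_section_jobs (
  section_id bigint primary key references document_sections (id) on delete cascade,
  status text not null default 'pending',  -- 'pending' | 'embedded' | 'failed'
  updated_at timestamptz not null default now()
);

-- Alert candidates: sections still pending after some grace period
select section_id
from document_section_jobs
where status = 'pending' and updated_at < now() - interval '10 minutes';
```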
When I run npm run dev, open localhost:3000, and try to sign up, after clicking sign up it gives me the error "Error: Hydration failed because the initial UI does not match what was rendered on the server." That was with the default starter.
gte-small only supports embeddings for English text.
Use a multilingual embedding model instead so that more languages can be supported.
This project's readme
This is an amazing tutorial, hats off. I thought I could offer some user feedback.
I ran into the following points of confusion while following along; feel free to disregard and close this issue:
Part 2:
Consider adding ON DELETE CASCADE to the DDL of the document_sections table to make real life easier when deleting documents.
Part 3:
transformers.js requires a model that has an ONNX runtime. You explain ONNX in the video, but you gloss over the fact that, to make this tutorial work, Supabase converted the model to ONNX themselves and created Supabase/gte-small. Indeed, the original model thenlper/gte-small raises this error in the Deno function runtime:
event loop error: Error: Could not locate file: "https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/onnx/model_quantized.onnx"
npx supabase db reset gets things out of that situation. But in a more production-like setting, how do you handle the rows that fall behind? Relying on Postgres triggers is amazing, but the functions are created in the private schema, which means we can't run them in the SQL Editor, for example:
select
  routine_name as function_name,
  routine_schema,
  data_type
from information_schema.routines
where routine_type = 'FUNCTION' and routine_schema = 'private';
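On the "rows that fall behind" question: one hedged way to find them is to look for sections whose embedding column is still null (column names follow the tutorial's schema; verify against your migration):

```sql
-- Sections the embed function never finished processing
select id
from document_sections
where embedding is null;
```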
Part 4:
The following function:
create function private.handle_storage_update()
returns trigger
language plpgsql
as $$
declare
  document_id bigint;
  result int;
begin
  insert into documents (name, storage_object_id, created_by)
  values (new.path_tokens[2], new.id, new.owner)
  returning id into document_id;

  select
    net.http_post(
      url := supabase_url() || '/functions/v1/process',
      headers := jsonb_build_object(
        'Content-Type', 'application/json',
        'Authorization', current_setting('request.headers')::json->>'authorization'
      ),
      body := jsonb_build_object(
        'document_id', document_id
      )
    )
  into result;

  return null;
end;
$$;

create trigger on_file_upload
  after insert on storage.objects
  for each row
  execute procedure private.handle_storage_update();
throws: unrecognized configuration parameter "request.headers". Any idea why this happens?
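One likely cause, offered as a guess: `request.headers` is only populated for sessions that come in through the API layer (PostgREST), so when the trigger fires in a context where the setting is absent, the one-argument form of `current_setting` raises this error. Postgres's two-argument form with `missing_ok = true` returns NULL instead of erroring:

```sql
-- Returns null (instead of raising) when the setting is absent in this session
select current_setting('request.headers', true);
```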
Currently trying to follow step 3. After executing npx supabase functions serve and uploading the files on the website, I can see the files in the documents section of the table editor, but all embedding vectors are NULL.
This is the error I am receiving:
Segmentation fault (core dumped)
2024/03/24 03:41:56 Sent Header: Host [api.moby.localhost]
2024/03/24 03:41:56 Sent Header: User-Agent [Docker-Client/unknown-version (linux)]
2024/03/24 03:41:56 Send Done
2024/03/24 03:41:57 Recv First Byte
error running container: exit 139
Steps to reproduce the behavior, please provide code snippets or a repository:
1. git checkout step-4 and npx supabase start
2. npm run dev and npx supabase functions serve
3. After running npx supabase functions serve, the error above would be shown.

Expected: the file should have been uploaded and embedding vectors should have been shown in the table editor.
(https://github.com/supabase-community/chatgpt-your-files/blob/main/README.md)
jon@local~/c/p/chatgpt-your-files $ npx supabase functions deploy
Bundling chat
failed to start docker container: Error response from daemon: failed to create task for container: failed to initialize logging driver: dial tcp 127.0.0.1:54328: connect: connection refused
This part is killing me. I am doing the tutorial fully in the cloud and get stuck on deploying functions. It seems like it is trying to use local Docker.
First of all, thank you for this great resource!
Trying to follow along with the video you posted a few days ago, but when I upload an .md file and the edge functions start, I get a:
shutdown (reason: CPU time limit reached)
I am using the same Rome sample_files you used in the video.
One key difference is that I am not running it locally, but in the cloud (in supabase.com)
npx supabase db push
npm run dev
Expected: no warning or shutdown before all the rows have been processed.
Fixed; sharing in case anyone else gets the same error.
I did npx supabase db push and npx supabase functions deploy. Under supabase/functions I created a .env and added my OpenAI API key. In .env.local I added my Supabase project URL and anon key. I just double-checked both. I also added my Supabase URL to seed.sql.
I then ran npm run build, then npm run start. When I open up the website and try uploading a file, I get an error on the bottom right.
The solution was that I had not added my Supabase URL in seed.sql before running npx supabase db push. When I did add it and reran npx supabase db push, Supabase had already applied seed.sql and did not care that it had changed.
The file looks like this:
select vault.create_secret(
  'https://blah blah balh.supabase.co',
  'supabase_url'
);
See how my vault has no secrets:
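If you want to check from SQL rather than the dashboard, a query against Vault's decrypted view can confirm whether the secret exists (assuming the standard Supabase Vault setup):

```sql
-- Lists the secret if it was actually created in Vault
select name
from vault.decrypted_secrets
where name = 'supabase_url';
```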
I just added it manually to the vault:
Now I can upload a file just fine.
The error is the following:
error: Uncaught (in promise) Error: Relative import path "http" not prefixed with / or ./ or ../ and not in import map from "https://esm.sh/v135/@types/[email protected]/http.d.ts"
const ret = new Error(getStringFromWasm0(arg0, arg1));
Try to deploy the function in Supabase.
Expected: deployment is successful.
I deployed this function last week and got it working with no issues, but now it is failing. It is as if a submodule has changed or been updated.
Similar to #3 but embeddings do get created.
When running locally, I get the following errors in the edge runtime:
client connection error (hyper::Error(IncompleteMessage))
wall clock duration reached
Embeddings are generated, but the edge function always hits the wall clock limit. It doesn't seem to be terminated properly.
Here's an example, each call of the 'embed' function generates a similar console output:
2023-12-20 11:08:26 serving the request with /home/deno/functions/embed\n
2023-12-20 11:08:28 client connection error (hyper::Error(IncompleteMessage))\n
2023-12-20 11:08:42 [Info] Generated embedding {"table":"document_sections","id":11,"contentColumn":"content","embeddingColumn":"embedding"}
2023-12-20 11:11:45 wall clock duration warning. isolate: d49cf7c6-5ea0-4197-828a-e474bcea6940
2023-12-20 11:15:05 wall clock duration reached. isolate: d49cf7c6-5ea0-4197-828a-e474bcea6940
I haven't got a great amount of experience with edge functions in Supabase, so I'm not sure if this is expected behaviour, but it seems like the edge function is being kept alive when it should be terminated after generating the embedding.
The client connection error seems to be related to the hyper client in Deno, and I'm not sure how to troubleshoot that.
Run the example as per the readme.
Expected: no warnings; the function terminates before any wall clock warnings.