yoshoku / hnswlib-node

hnswlib-node provides Node.js bindings for Hnswlib.
Home Page: https://www.npmjs.com/package/hnswlib-node
License: Apache License 2.0
Hi there! This is great. It seems the cosine metric is missing, though. Based on the Python bindings, it should be as easy as storing a normalize flag somewhere in the HierarchicalNSW class of addon.cc and, when it is true, using IP and normalizing all the vectors.
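In the meantime, cosine distance over unit-length vectors equals inner-product distance, so normalizing on the JS side and using the 'ip' space gives cosine search today. A minimal sketch; normalize() is plain JS, and the commented-out index calls are how I understand the hnswlib-node API, so treat the exact usage as an assumption:

```javascript
// Cosine distance equals inner-product distance once vectors are unit
// length, so normalizing before addPoint/searchKnn and using the 'ip'
// space emulates a cosine metric. normalize() is plain JS; the index
// calls below (commented out) are a sketch of the hnswlib-node API.
function normalize(vector) {
  const norm = Math.sqrt(vector.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? vector.slice() : vector.map((x) => x / norm);
}

// const { HierarchicalNSW } = require('hnswlib-node');
// const index = new HierarchicalNSW('ip', dimensions);
// index.initIndex(maxElements);
// index.addPoint(normalize(embedding), label);
// const { neighbors } = index.searchKnn(normalize(query), k);
```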
Hey, I don't know if this is something I'm doing wrong or an issue with the library, but I thought I'd flag it / ask for help.
I've created vector embeddings for ~275,000 words from an English dictionary using ada-002, and I've added them to an index with the code below.
Whenever I search, it always returns the same set of words regardless of the query embedding.
Is this a problem with the number of embeddings I'm supplying? Am I doing something else wrong?
Here is my code:
```javascript
import pkg from 'hnswlib-node'
const { HierarchicalNSW } = pkg

export const createIndexCallback = async (name, dimensions, maxElements, callback) => {
  // this needs to *get the element from the callback each time*.
  const index = new HierarchicalNSW('l2', dimensions)
  index.initIndex(maxElements)
  for (let i = 0; i < maxElements; i++) {
    const embedding = await callback(i)
    index.addPoint(embedding, i)
    console.log(`Added ${i} of ${maxElements}`)
  }
  index.writeIndexSync(`${name}.dat`)
  return index
}

export const searchIndex = (name, embedding, k = 5) => {
  const index = new HierarchicalNSW('l2', embedding.length)
  index.readIndexSync(`${name}.dat`)
  const result = index.searchKnn(embedding, k)
  console.table(result)
  return result
}
```
Is the HNSW index automatically garbage collected, or is it recommended that we "free" the memory when we're done? I was wondering whether something like resizeIndex would work.
I ask because I intermittently get this error:
FATAL ERROR: Error::New napi_get_last_error_info
1: 0x10253549c node::Abort() [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
2: 0x102535594 node::OOMErrorHandler(char const*, bool) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
3: 0x1025354b4 node::OnFatalError(char const*, char const*) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
4: 0x102504678 napi_open_callback_scope [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
5: 0x10f959cb8 Napi::Error::Error(napi_env__*, napi_value__*) [/Users/neha/my_app/node_modules/.pnpm/[email protected]/node_modules/hnswlib-node/build/Release/addon.node]
6: 0x10f95950c Napi::Error::New(napi_env__*) [/Users/neha/my_app/node_modules/.pnpm/[email protected]/node_modules/hnswlib-node/build/Release/addon.node]
7: 0x10f969108 Napi::Value::ToBoolean() const [/Users/neha/my_app/node_modules/.pnpm/[email protected]/node_modules/hnswlib-node/build/Release/addon.node]
8: 0x10f969094 CustomFilterFunctor::operator()(unsigned long) [/Users/neha/my_app/node_modules/.pnpm/[email protected]/node_modules/hnswlib-node/build/Release/addon.node]
9: 0x10f976e98 std::__1::priority_queue<std::__1::pair<float, unsigned int>, std::__1::vector<std::__1::pair<float, unsigned int>, std::__1::allocator<std::__1::pair<float, unsigned int> > >, hnswlib::HierarchicalNSW<float>::CompareByFirst> hnswlib::HierarchicalNSW<float>::searchBaseLayerST<false, true>(unsigned int, void const*, unsigned long, hnswlib::BaseFilterFunctor*) const [/Users/neha/my_app/node_modules/.pnpm/[email protected]/node_modules/hnswlib-node/build/Release/addon.node]
10: 0x10f97077c hnswlib::HierarchicalNSW<float>::searchKnn(void const*, unsigned long, hnswlib::BaseFilterFunctor*) const [/Users/neha/my_app/node_modules/.pnpm/[email protected]/node_modules/hnswlib-node/build/Release/addon.node]
11: 0x10f96d554 HierarchicalNSW::searchKnn(Napi::CallbackInfo const&) [/Users/neha/my_app/node_modules/.pnpm/[email protected]/node_modules/hnswlib-node/build/Release/addon.node]
12: 0x10f96fafc Napi::InstanceWrap<HierarchicalNSW>::InstanceMethodCallbackWrapper(napi_env__*, napi_callback_info__*)::'lambda'()::operator()() const [/Users/neha/my_app/node_modules/.pnpm/[email protected]/node_modules/hnswlib-node/build/Release/addon.node]
13: 0x10f96f8a8 Napi::InstanceWrap<HierarchicalNSW>::InstanceMethodCallbackWrapper(napi_env__*, napi_callback_info__*) [/Users/neha/my_app/node_modules/.pnpm/[email protected]/node_modules/hnswlib-node/build/Release/addon.node]
14: 0x1024f6990 v8impl::(anonymous namespace)::FunctionCallbackWrapper::Invoke(v8::FunctionCallbackInfo<v8::Value> const&) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
15: 0x1026feb8c v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
16: 0x1026fe688 v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
17: 0x1026fdeb4 v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
18: 0x102eed18c Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_BuiltinExit [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
19: 0x102e78198 Builtins_InterpreterEntryTrampoline [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
20: 0x102eac0d0 Builtins_GeneratorPrototypeNext [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
21: 0xf3b6af7f4
22: 0x102f36b88 Builtins_PromiseConstructor [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
23: 0x102e75914 Builtins_JSBuiltinsConstructStub [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
24: 0xf3b6afb30
25: 0x102e78198 Builtins_InterpreterEntryTrampoline [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
26: 0x102e78198 Builtins_InterpreterEntryTrampoline [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
27: 0x102eac0d0 Builtins_GeneratorPrototypeNext [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
28: 0xf3b6af7f4
29: 0x102f36b88 Builtins_PromiseConstructor [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
30: 0x102e75914 Builtins_JSBuiltinsConstructStub [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
31: 0xf3b6afb30
32: 0x102e78198 Builtins_InterpreterEntryTrampoline [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
33: 0x102e78198 Builtins_InterpreterEntryTrampoline [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
34: 0x102eac0d0 Builtins_GeneratorPrototypeNext [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
35: 0xf3b71df38
36: 0x102f38738 Builtins_PromiseFulfillReactionJob [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
37: 0x102e9bc4c Builtins_RunMicrotasks [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
38: 0x102e763a4 Builtins_JSRunMicrotasksEntry [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
39: 0x1027ba99c v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
40: 0x1027bae8c v8::internal::(anonymous namespace)::InvokeWithTryCatch(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
41: 0x1027bb068 v8::internal::Execution::TryRunMicrotasks(v8::internal::Isolate*, v8::internal::MicrotaskQueue*, v8::internal::MaybeHandle<v8::internal::Object>*) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
42: 0x1027e17b4 v8::internal::MicrotaskQueue::RunMicrotasks(v8::internal::Isolate*) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
43: 0x1027e204c v8::internal::MicrotaskQueue::PerformCheckpoint(v8::Isolate*) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
44: 0x102e79a34 Builtins_CallApiCallback [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
45: 0xf3b75534c
46: 0x102e764d0 Builtins_JSEntryTrampoline [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
47: 0x102e76164 Builtins_JSEntry [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
48: 0x1027ba9cc v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
49: 0x1027b9f00 v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
50: 0x1026aa294 v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
51: 0x102484d18 node::InternalCallbackScope::Close() [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
52: 0x102484fe8 node::InternalMakeCallback(node::Environment*, v8::Local<v8::Object>, v8::Local<v8::Object>, v8::Local<v8::Function>, int, v8::Local<v8::Value>*, node::async_context) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
53: 0x102499c14 node::AsyncWrap::MakeCallback(v8::Local<v8::Function>, int, v8::Local<v8::Value>*) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
54: 0x1025eb77c node::StreamBase::CallJSOnreadMethod(long, v8::Local<v8::ArrayBuffer>, unsigned long, node::StreamBase::StreamBaseJSChecks) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
55: 0x1025ece34 node::EmitToJSStreamListener::OnStreamRead(long, uv_buf_t const&) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
56: 0x10265d760 node::crypto::TLSWrap::ClearOut() [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
57: 0x10265f45c node::crypto::TLSWrap::OnStreamRead(long, uv_buf_t const&) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
58: 0x1025f0e10 node::LibuvStreamWrap::OnUvRead(long, uv_buf_t const*) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
59: 0x1025f15ac node::LibuvStreamWrap::ReadStart()::$_1::__invoke(uv_stream_s*, long, uv_buf_t const*) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
60: 0x102e61fd8 uv__stream_io [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
61: 0x102e6a0f0 uv__io_poll [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
62: 0x102e57e1c uv_run [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
63: 0x102485704 node::SpinEventLoop(node::Environment*) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
64: 0x1025d89ec node::worker::Worker::Run() [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
65: 0x1025db7a4 node::worker::Worker::StartThread(v8::FunctionCallbackInfo<v8::Value> const&)::$_3::__invoke(void*) [/Users/neha/.nvm/versions/node/v18.15.0/bin/node]
66: 0x19891e06c _pthread_start [/usr/lib/system/libsystem_pthread.dylib]
67: 0x198918e2c thread_start [/usr/lib/system/libsystem_pthread.dylib]
It's worth noting that I use the filterFunction argument quite frequently, to check whether a label is in some pre-defined set of indices.
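On that filterFunction point: since the predicate runs once per search candidate, membership checks against a plain array get expensive; a Set keeps each check O(1). A small sketch, where the label set is a hypothetical example and the commented-out three-argument searchKnn call reflects my understanding of how hnswlib-node accepts the filter:

```javascript
// The filter predicate receives each candidate's numeric label and must
// return true to keep it. Using a Set makes each membership check O(1).
// allowedLabels is a hypothetical example; the searchKnn call (commented
// out) sketches passing the filter as the third argument.
const allowedLabels = new Set([3, 14, 159]);
const filterByLabel = (label) => allowedLabels.has(label);

// const { neighbors, distances } = index.searchKnn(queryEmbedding, k, filterByLabel);
```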
Thanks!
I can't install hnswlib-node :/
```
PS D:\langchain-workshop-master> pnpm i hnswlib-node
Packages: +5 -1
+++++-
Progress: resolved 116, reused 94, downloaded 0, added 5, done
node_modules/.pnpm/[email protected]/node_modules/hnswlib-node: Running install script, failed in 1.2s
.../node_modules/hnswlib-node install$ node-gyp rebuild
D:\langchain-workshop-master\node_modules\.pnpm\[email protected]\node_modules\hnswlib-node>if not defined npm_con…
gyp info it worked if it ends with ok
gyp info using [email protected]
gyp info using [email protected] | win32 | x64
gyp info find Python using Python version 3.12.1 found at "C:\Users\FAMILIA\AppData\Local\Programs\Python\Python312…
gyp info find VS using VS2022 (17.8.34511.84) found at:
gyp info find VS "C:\Program Files\Microsoft Visual Studio\2022\Community"
gyp info find VS run with --verbose for detailed information
gyp info spawn C:\Users\FAMILIA\AppData\Local\Programs\Python\Python312\python.exe
gyp info spawn args [
gyp info spawn args 'C:\\Users\\FAMILIA\\AppData\\Roaming\\npm\\node_modules\\pnpm\\dist\\node_modules\\node-gyp\…
gyp info spawn args 'binding.gyp',
gyp info spawn args '-f',
gyp info spawn args 'msvs',
gyp info spawn args '-I',
gyp info spawn args 'D:\\langchain-workshop-master\\node_modules\\.pnpm\\[email protected]\\node_modules\\hnswli…
gyp info spawn args '-I',
gyp info spawn args 'C:\\Users\\FAMILIA\\AppData\\Roaming\\npm\\node_modules\\pnpm\\dist\\node_modules\\node-gyp\…
gyp info spawn args '-I',
gyp info spawn args 'C:\\Users\\FAMILIA\\AppData\\Local\\node-gyp\\Cache\\18.17.1\\include\\node\\common.gypi',
gyp info spawn args '-Dlibrary=shared_library',
gyp info spawn args '-Dvisibility=default',
gyp info spawn args '-Dnode_root_dir=C:\\Users\\FAMILIA\\AppData\\Local\\node-gyp\\Cache\\18.17.1',
gyp info spawn args '-Dnode_gyp_dir=C:\\Users\\FAMILIA\\AppData\\Roaming\\npm\\node_modules\\pnpm\\dist\\node_mod…
gyp info spawn args '-Dnode_lib_file=C:\\\\Users\\\\FAMILIA\\\\AppData\\\\Local\\\\node-gyp\\\\Cache\\\\18.17.1\\…
gyp info spawn args '-Dmodule_root_dir=D:\\langchain-workshop-master\\node_modules\\.pnpm\\[email protected]\\no…
gyp info spawn args '-Dnode_engine=v8',
gyp info spawn args '--depth=.',
gyp info spawn args '--no-parallel',
gyp info spawn args '--generator-output',
gyp info spawn args 'D:\\langchain-workshop-master\\node_modules\\.pnpm\\[email protected]\\node_modules\\hnswli…
gyp info spawn args '-Goutput_dir=.'
gyp info spawn args ]
Traceback (most recent call last):
  File "C:\Users\FAMILIA\AppData\Roaming\npm\node_modules\pnpm\dist\node_modules\node-gyp\gyp\gyp_main.py", line 42…
    import gyp  # noqa: E402
    ^^^^^^^^^^
  File "C:\Users\FAMILIA\AppData\Roaming\npm\node_modules\pnpm\dist\node_modules\node-gyp\gyp\pylib\gyp\__init__.py…
    import gyp.input
  File "C:\Users\FAMILIA\AppData\Roaming\npm\node_modules\pnpm\dist\node_modules\node-gyp\gyp\pylib\gyp\input.py", …
    from distutils.version import StrictVersion
ModuleNotFoundError: No module named 'distutils'
gyp ERR! configure error
gyp ERR! stack Error: gyp failed with exit code: 1
gyp ERR! stack at ChildProcess.onCpExit (C:\Users\FAMILIA\AppData\Roaming\npm\node_modules\pnpm\dist\node_modul…
gyp ERR! stack at ChildProcess.emit (node:events:514:28)
gyp ERR! stack at ChildProcess._handle.onexit (node:internal/child_process:291:12)
gyp ERR! System Windows_NT 10.0.19045
gyp ERR! command "C:\\Program Files\\nodejs\\node.exe" "C:\\Users\\FAMILIA\\AppData\\Roaming\\npm\\node_modules\\pn…
gyp ERR! cwd D:\langchain-workshop-master\node_modules\.pnpm\[email protected]\node_modules\hnswlib-node
gyp ERR! node -v v18.17.1
gyp ERR! node-gyp -v v9.4.1
gyp ERR! not ok
└─ Failed in 1.2s at D:\langchain-workshop-master\node_modules\.pnpm\[email protected]\node_modules\hnswlib-node
ELIFECYCLE Command failed with exit code 1.
```
When trying to use the library on AWS Lambda, I'm getting the following error:
"/var/task/node_modules/hnswlib-node/build/Release/addon.node: invalid ELF header"
It looks like it's a binding issue?
Hi,
I was trying to use markDelete, importing the library standalone or using it through langchain.
I had to do it with the library itself, as langchain does not provide a way to do it.
At the same time, I edited docstore.json to remove the points marked as deleted.
Then I rebuilt the index, hoping that the points had been removed.
Is this correct?
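One caution, as far as I understand it: markDelete only flags a label inside the index file; it does not renumber labels or shrink the file, so hand-editing docstore.json can leave the two files out of sync unless exactly the same labels are removed on both sides. A hypothetical helper for the docstore side, assuming the docstore is an object keyed by numeric label (an assumption about the langchain file layout, not something the library documents here):

```javascript
// Drop the entries for deleted labels from a docstore object keyed by
// label, so the docstore stays in sync with labels marked via markDelete.
// The docstore shape (label -> document) is an assumption.
function pruneDocstore(docstore, deletedLabels) {
  return Object.fromEntries(
    Object.entries(docstore).filter(([label]) => !deletedLabels.has(Number(label)))
  );
}

// Sketch of the index side (hnswlib-node API as I understand it):
// for (const label of deletedLabels) index.markDelete(label);
// index.writeIndexSync('hnswlib.index');
```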
At the same time, is there a way to simply edit the metadata of points without deleting / removing them?
I mean, if I change the metadata in docstore.json and then rebuild the index, is this also correct?
Thanks
Hi!
Thank you for the great library. I was wondering if there was any way to add/delete elements in an index without having to rebuild it entirely?
It seems like hnswlib does support it but I haven’t found any documentation about it in hnswlib-node.
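For what it's worth, the underlying hnswlib does support incremental updates, and hnswlib-node appears to expose them: addPoint appends without a rebuild, markDelete hides an element, and resizeIndex grows capacity once the index is full. A sketch under that assumption; the capacity helper is plain JS, and the commented-out index calls are my reading of the hnswlib-node API:

```javascript
// Compute a grown capacity (doubling by default) once the index is full,
// so repeated inserts trigger only O(log n) resizes, not one per element.
function nextCapacity(current, needed, growthFactor = 2) {
  let capacity = Math.max(current, 1);
  while (capacity < needed) capacity = Math.ceil(capacity * growthFactor);
  return capacity;
}

// Sketch of incremental updates (assumed hnswlib-node API):
// if (index.getCurrentCount() >= index.getMaxElements()) {
//   index.resizeIndex(nextCapacity(index.getMaxElements(), index.getCurrentCount() + 1));
// }
// index.addPoint(embedding, label);   // add without rebuilding
// index.markDelete(staleLabel);       // hide without rebuilding
```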
Thanks again!
I recently made a simple Typescript function to create a VectorStore using HNSWLib-node.
It saves the vector store in a folder and then, in another script file, I load and execute a RetrievalQAChain using OpenAI.
Everything was working fine until I decided to put that in an AWS Lambda function.
My package.json has the following dependencies:
"hnswlib-node": "^1.4.2",
"langchain": "^0.0.59",
Also, I double checked and the hnswlib-node folder is inside "node_modules" folder in my lambda function folder.
However, I keep getting the following error (from CloudWatch Logs):
```
ERROR Invoke Error {
  "errorType": "Error",
  "errorMessage": "Please install hnswlib-node as a dependency with, e.g. `npm install -S hnswlib-node`",
  "stack": [
    "Error: Please install hnswlib-node as a dependency with, e.g. `npm install -S hnswlib-node`",
    "    at Function.imports (/var/task/node_modules/langchain/dist/vectorstores/hnswlib.cjs:161:19)",
    "    at async Function.getHierarchicalNSW (/var/task/node_modules/langchain/dist/vectorstores/hnswlib.cjs:38:37)",
    "    at async Function.load (/var/task/node_modules/langchain/dist/vectorstores/hnswlib.cjs:123:23)",
    "    at async AMCompanion (/var/task/index.js:18:29)",
    "    at async Runtime.exports.handler (/var/task/index.js:39:22)"
  ]
}
```
Also, this error is not thrown on importing HNSWLib, but only in the following line of code:
```javascript
const vectorStore = await HNSWLib.load("data", new OpenAIEmbeddings({
  openAIApiKey: process.env.OPENAI_API_KEY,
}))
```
This is my import:
```javascript
const { HNSWLib } = require("langchain/vectorstores/hnswlib")
```
It seems I'm not the only one with this problem. See this post.
Expected behavior: code would be executed properly, just like when executed on my local machine.
Actual behavior: the error pasted above.
```
{"time":1681147316100,"hostname":"6431b4565cfa36c45b618ba6","pid":31376,"level":"error","name":"pnpm","code":"ELIFECYCLE","errno":1,"pkgid":"[email protected]","stage":"install","script":"node-gyp rebuild","pkgname":"hnswlib-node","name":"pnpm","err":{"name":"pnpm","message":"[email protected] install: node-gyp rebuild\nExit status 1","code":"ELIFECYCLE","stack":"pnpm: [email protected] install: node-gyp rebuild\nExit status 1\n    at EventEmitter.<anonymous> (/usr/lib/node_modules/pnpm/dist/pnpm.cjs:99653:17)\n    at EventEmitter.emit (node:events:513:28)\n    at ChildProcess.<anonymous> (/usr/lib/node_modules/pnpm/dist/pnpm.cjs:83511:18)\n    at ChildProcess.emit (node:events:513:28)\n    at maybeClose (node:internal/child_process:1091:16)\n    at ChildProcess._handle.onexit (node:internal/child_process:302:5)"}}
```
$ npm i hnswlib-node
npm ERR! code 1
npm ERR! path C:\sandbox\biz-int-starship\functions\node_modules\node
npm ERR! command failed
npm ERR! command C:\WINDOWS\system32\cmd.exe /d /s /c node installArchSpecificPackage
npm ERR! npm ERR! code EBADPLATFORM
npm ERR! npm ERR! notsup Unsupported platform for [email protected]: wanted {"os":"win32","arch":"x86"} (current: {"os":"win32","arch":"ia32"})
npm ERR! npm ERR! notsup Valid OS: win32
npm ERR! npm ERR! notsup Valid Arch: x86
npm ERR! npm ERR! notsup Actual OS: win32
npm ERR! npm ERR! notsup Actual Arch: ia32
npm ERR!
npm ERR! npm ERR! A complete log of this run can be found in:
npm ERR! npm ERR! C:\Users\alexv\AppData\Local\npm-cache\_logs\2023-03-11T02_53_13_850Z-debug-0.log
npm ERR! node:internal/modules/cjs/loader:1078
npm ERR! throw err;
npm ERR! ^
npm ERR!
npm ERR! Error: Cannot find module 'node-win-x86/package.json'
npm ERR! Require stack:
npm ERR! - C:\sandbox\biz-int-starship\functions\node_modules\node\installArchSpecificPackage.js
npm ERR! at Module._resolveFilename (node:internal/modules/cjs/loader:1075:15)
npm ERR! at Function.resolve (node:internal/modules/cjs/helpers:116:19)
npm ERR! at ChildProcess.<anonymous> (C:\sandbox\biz-int-starship\functions\node_modules\node-bin-setup\index.js:19:27)
npm ERR! at ChildProcess.emit (node:events:513:28)
npm ERR! at maybeClose (node:internal/child_process:1091:16)
npm ERR! at ChildProcess._handle.onexit (node:internal/child_process:302:5) {
npm ERR! code: 'MODULE_NOT_FOUND',
npm ERR! requireStack: [
npm ERR! 'C:\sandbox\biz-int-starship\functions\node_modules\node\installArchSpecificPackage.js'
npm ERR! ]
npm ERR! }
npm ERR!
npm ERR! Node.js v18.15.0
npm ERR! A complete log of this run can be found in:
npm ERR! C:\Users\alexv\AppData\Local\npm-cache\_logs\2023-03-11T02_53_10_649Z-debug-0.log
Is there any way I could load the index from a URL (like git lfs or a GitHub release attachment)? The current size of my hnswlib.index is 175 MB, which goes over Vercel's 50 MB limit for serverless functions. It looks like the loadIndex argument does have to be a file path, since it feeds into an ifstream. Is there anything I can do?
Hello, I am running into this error when attempting to install hnswlib-node
```
>> npm install hnswlib-node
npm ERR! code EUNSUPPORTEDPROTOCOL
npm ERR! Unsupported URL Type "link:": link:@types/emotion/cache
npm ERR! A complete log of this run can be found in:
npm ERR! /home/kac487/.npm/_logs/2023-04-21T02_29_33_748Z-debug-0.log
```
Here are the details from the contents of the log file referenced above
...
...
227 timing idealTree Completed in 18918ms
228 timing command:install Completed in 18925ms
229 verbose stack Error: Unsupported URL Type "link:": link:@types/emotion/cache
229 verbose stack at unsupportedURLType (/home/kac487/.nvm/versions/node/v18.15.0/lib/node_modules/npm/node_modules/npm-package-arg/lib/npa.js:327:15)
229 verbose stack at fromURL (/home/kac487/.nvm/versions/node/v18.15.0/lib/node_modules/npm/node_modules/npm-package-arg/lib/npa.js:387:13)
229 verbose stack at Function.resolve (/home/kac487/.nvm/versions/node/v18.15.0/lib/node_modules/npm/node_modules/npm-package-arg/lib/npa.js:83:12)
229 verbose stack at [nodeFromEdge] (/home/kac487/.nvm/versions/node/v18.15.0/lib/node_modules/npm/node_modules/@npmcli/arborist/lib/arborist/build-ideal-tree.js:1058:22)
229 verbose stack at [buildDepStep] (/home/kac487/.nvm/versions/node/v18.15.0/lib/node_modules/npm/node_modules/@npmcli/arborist/lib/arborist/build-ideal-tree.js:929:36)
229 verbose stack at async Arborist.buildIdealTree (/home/kac487/.nvm/versions/node/v18.15.0/lib/node_modules/npm/node_modules/@npmcli/arborist/lib/arborist/build-ideal-tree.js:207:7)
229 verbose stack at async Promise.all (index 1)
229 verbose stack at async Arborist.reify (/home/kac487/.nvm/versions/node/v18.15.0/lib/node_modules/npm/node_modules/@npmcli/arborist/lib/arborist/reify.js:159:5)
229 verbose stack at async Install.exec (/home/kac487/.nvm/versions/node/v18.15.0/lib/node_modules/npm/lib/commands/install.js:146:5)
229 verbose stack at async module.exports (/home/kac487/.nvm/versions/node/v18.15.0/lib/node_modules/npm/lib/cli.js:134:5)
230 verbose cwd /home/kac487/dev/project/repos/projectDirectory
231 verbose Linux 5.19.0-40-generic
232 verbose node v18.15.0
233 verbose npm v9.5.0
234 error code EUNSUPPORTEDPROTOCOL
235 error Unsupported URL Type "link:": link:@types/emotion/cache
236 verbose exit 1
237 timing npm Completed in 18951ms
238 verbose unfinished npm timer reify 1682044173773
239 verbose unfinished npm timer reify:loadTrees 1682044173776
240 verbose unfinished npm timer idealTree:buildDeps 1682044173789
241 verbose unfinished npm timer idealTree:#root 1682044173789
242 verbose code 1
243 error A complete log of this run can be found in:
243 error /home/kac487/.npm/_logs/2023-04-21T02_29_33_748Z-debug-0.log
I've tried using npm and pnpm to install hnswlib-node, but I haven't had any luck with either. Any help would be greatly appreciated.
It doesn't work on Bun. Is it possible to make it Bun compatible?
This is the code that I am using
```javascript
import { RetrievalQAChain } from 'langchain/chains';
import { HNSWLib } from "langchain/vectorstores";
import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';
import { LLamaEmbeddings } from "llama-node/dist/extensions/langchain.js";
import { LLM } from "llama-node";
import { LLamaCpp } from "llama-node/dist/llm/llama-cpp.js";
import * as fs from 'fs';
import * as path from 'path';

const txtFilename = "TrainData";
const txtPath = `./${txtFilename}.txt`;
const VECTOR_STORE_PATH = `${txtFilename}.index`;
const model = path.resolve(process.cwd(), './h2ogptq-oasst1-512-30B.ggml.q5_1.bin');

const llama = new LLM(LLamaCpp);
const config = {
  path: model,
  enableLogging: true,
  nCtx: 1024,
  nParts: -1,
  seed: 0,
  f16Kv: false,
  logitsAll: false,
  vocabOnly: false,
  useMlock: false,
  embedding: true,
  useMmap: true,
};

var vectorStore;
const run = async () => {
  await llama.load(config);
  if (fs.existsSync(VECTOR_STORE_PATH)) {
    console.log('Vector Exists..');
    vectorStore = await HNSWLib.fromExistingIndex(VECTOR_STORE_PATH, new LLamaEmbeddings({maxConcurrency: 1}, llama));
  } else {
    console.log('Creating Documents');
    const text = fs.readFileSync(txtPath, 'utf8');
    const textSplitter = new RecursiveCharacterTextSplitter({chunkSize: 1000});
    const docs = await textSplitter.createDocuments([text]);
    console.log('Creating Vector');
    vectorStore = await HNSWLib.fromDocuments(docs, new LLamaEmbeddings({maxConcurrency: 1}, llama));
    await vectorStore.save(VECTOR_STORE_PATH);
  }
  console.log('Testing Vector via Similarity Search');
  const resultOne = await vectorStore.similaritySearch("what is a template", 1);
  console.log(resultOne);
  console.log('Testing Vector via RetrievalQAChain');
  const chain = RetrievalQAChain.fromLLM(llama, vectorStore.asRetriever());
  const res = await chain.call({
    query: "what is a template",
  });
  console.log({res});
};
run();
```
It is only using 4 CPUs at the time of `vectorStore = await HNSWLib.fromDocuments(docs, new LLamaEmbeddings({maxConcurrency: 1}, llama));`.
Can we change anything for it to use more than 4 CPUs?
Wondering if it is safe to upgrade to v2.0.0, is this a breaking change?
While the exact same Docker image builds just fine on the Linux AWS EC2 servers and in GitHub Actions, on my M2 MacBook I get
```
#0 14.81 file:///app/node_modules/langchain/dist/vectorstores/hnswlib.js:172
#0 14.81     throw new Error("Please install hnswlib-node as a dependency with, e.g. `npm install -S hnswlib-node`");
#0 14.81           ^
#0 14.81
#0 14.81 Error: Please install hnswlib-node as a dependency with, e.g. `npm install -S hnswlib-node`
#0 14.81     at HNSWLib.imports (file:///app/node_modules/langchain/dist/vectorstores/hnswlib.js:172:19)
#0 14.81     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
#0 14.81     at async HNSWLib.getHierarchicalNSW (file:///app/node_modules/langchain/dist/vectorstores/hnswlib.js:35:37)
#0 14.81     at async HNSWLib.initIndex (file:///app/node_modules/langchain/dist/vectorstores/hnswlib.js:49:26)
#0 14.81     at async HNSWLib.addVectors (file:///app/node_modules/langchain/dist/vectorstores/hnswlib.js:68:9)
#0 14.81     at async HNSWLib.fromDocuments (file:///app/node_modules/langchain/dist/vectorstores/hnswlib.js:163:9)
```
when `HNSWLib.fromDocuments(docs, new OpenAIEmbeddings({...embeddingsOpts, openAIApiKey: OPENAI_API_KEY}))` is called.
So it works on the server (luckily), but not on my local machine in docker 🤔 😕
Dockerfile:
```dockerfile
FROM node:18-alpine AS BUILD_IMAGE
RUN apk update && \
    apk add -q make g++ python3
```
...
@yoshoku I just wanted to say great work, I'm glad I found this lib. The changelog docs are extremely helpful. Looking into using it now!
Hi,
I am using
"hnswlib-node": "^2.0.0",
"langchain": "^0.0.144",
All works locally but when deployed to Lambda (node 18) I get:
"Runtime.ImportModuleError: Error: Cannot find module '../stores/doc/in_memory.cjs'",