amikos-tech / chromadb-chart
Chart for deploying ChromaDB in Kubernetes
License: MIT License
Adding CHROMA_SERVER_NOFILE=65535 helps when running large workloads, e.g. multiple concurrent clients. It would be nice to add it to the template and the entry file.
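As an illustration of what such a setting controls, here is a minimal Python sketch that reads CHROMA_SERVER_NOFILE and raises the soft open-file limit of the current process. The function name `apply_nofile_from_env` is hypothetical, and the chart's actual entry file may handle this differently; treat it as a sketch under those assumptions.

```python
import os
import resource  # Unix-only standard library module


def apply_nofile_from_env(var: str = "CHROMA_SERVER_NOFILE") -> tuple[int, int]:
    """Raise the soft open-file limit to the value in `var`, capped at the hard limit.

    Hypothetical helper for illustration; returns the (soft, hard) limits in effect.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    raw = os.environ.get(var)
    if raw is None:
        return soft, hard  # variable not set: leave the limits untouched
    target = int(raw)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)  # an unprivileged process cannot exceed the hard limit
    if soft != resource.RLIM_INFINITY and target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


print(apply_nofile_from_env())
```

The sketch only ever raises the soft limit, so a value lower than the current limit is ignored rather than applied.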
I've installed ChromaDB with the Helm chart and enabled the ingress in values.yaml. However, I always get 502 Bad Gateway. Is there a graphical interface I'm supposed to see when exposing it via ingress, or what is the ingress used for?
Otherwise the pod is up and running, and I was able to get the token from the default secret. So am I supposed to be able to reach it via Postman, for example, using the hostname specified in my ingress and an Authorization: Bearer token header in the API call? Apologies, I'm new to ChromaDB and I cannot find a description of how to connect to it, how to use it, etc.
Thanks in advance!
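For reference, a request of the kind described in the question could be built as follows. This is a minimal sketch: the hostname and token are placeholders, and the `/api/v1/heartbeat` path is an assumption based on the Chroma 0.4.x HTTP API. The request is only constructed, not sent.

```python
import urllib.request


def build_heartbeat_request(base_url: str, token: str) -> urllib.request.Request:
    """Build a GET request that carries the token as a Bearer Authorization header."""
    # Assumed endpoint path; adjust it if your Chroma version exposes a different one.
    url = base_url.rstrip("/") + "/api/v1/heartbeat"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Placeholder values: use your ingress hostname and the token from the chart's secret.
req = build_heartbeat_request("https://chroma.example.com", "my-token")
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing `req` to `urllib.request.urlopen` would send it. The same header can be set in Postman under the Authorization tab using the Bearer Token type.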
0.4.3
1.26.x
AKS, with an azure application gateway as an ingress controller.
No response
Users are starting to adopt Chroma in cloud provider settings with vendor-specific version of k8s.
Update the chart to support EKS, GKE and AKS k8s distros.
N/A
I cannot run the chart without it
No response
After switching to a new node pool we started encountering this issue. When using the 0.4.20 version we also see the warning SyntaxWarning: "is" with a literal. Did you mean "=="? However, when I change the image tag to 0.5.0 I can only see the numpy error.
I can see the second issue is fixed here:
But I don't find anything related to the numpy issue.
0.4.3
1.27.x
AKS
Collecting chroma-hnswlib
Downloading chroma_hnswlib-0.7.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (252 bytes)
Collecting numpy (from chroma-hnswlib)
Downloading numpy-2.0.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (60 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 60.9/60.9 kB 3.9 MB/s eta 0:00:00
Downloading chroma_hnswlib-0.7.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.4/2.4 MB 115.1 MB/s eta 0:00:00
Downloading numpy-2.0.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.3/19.3 MB 103.2 MB/s eta 0:00:00
Installing collected packages: numpy, chroma-hnswlib
Attempting uninstall: numpy
Found existing installation: numpy 1.26.4
Uninstalling numpy-1.26.4:
Successfully uninstalled numpy-1.26.4
Attempting uninstall: chroma-hnswlib
Found existing installation: chroma-hnswlib 0.7.3
Uninstalling chroma-hnswlib-0.7.3:
Successfully uninstalled chroma-hnswlib-0.7.3
Successfully installed chroma-hnswlib-0.7.3 numpy-2.0.0
/chroma/./chromadb/utils/embedding_functions.py:584: SyntaxWarning: "is" with a literal. Did you mean "=="?
if self._task_type is "RETRIEVAL_DOCUMENT":
Traceback (most recent call last):
File "/chroma/venv/bin/uvicorn", line 8, in <module>
sys.exit(main())
File "/chroma/venv/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/chroma/venv/lib/python3.10/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/chroma/venv/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/chroma/venv/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/chroma/venv/lib/python3.10/site-packages/uvicorn/main.py", line 408, in main
run(
File "/chroma/venv/lib/python3.10/site-packages/uvicorn/main.py", line 576, in run
server.run()
File "/chroma/venv/lib/python3.10/site-packages/uvicorn/server.py", line 60, in run
return asyncio.run(self.serve(sockets=sockets))
File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "uvloop/loop.pyx", line 1517, in uvloop.loop.Loop.run_until_complete
File "/chroma/venv/lib/python3.10/site-packages/uvicorn/server.py", line 67, in serve
config.load()
File "/chroma/venv/lib/python3.10/site-packages/uvicorn/config.py", line 479, in load
self.loaded_app = import_from_string(self.app)
File "/chroma/venv/lib/python3.10/site-packages/uvicorn/importer.py", line 21, in import_from_string
module = importlib.import_module(module_str)
File "/usr/local/lib/python3.10/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
File "<frozen importlib._bootstrap>", line 992, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 883, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/chroma/./chromadb/__init__.py", line 3, in <module>
from chromadb.api.client import Client as ClientCreator
File "/chroma/./chromadb/api/__init__.py", line 7, in <module>
from chromadb.api.models.Collection import Collection
File "/chroma/./chromadb/api/models/Collection.py", line 7, in <module>
import chromadb.utils.embedding_functions as ef
File "/chroma/./chromadb/utils/embedding_functions.py", line 3, in <module>
from chromadb.api.types import (
File "/chroma/./chromadb/api/types.py", line 101, in <module>
ImageDType = Union[np.uint, np.int_, np.float_]
File "/chroma/venv/lib/python3.10/site-packages/numpy/__init__.py", line 397, in __getattr__
raise AttributeError(
AttributeError: `np.float_` was removed in the NumPy 2.0 release. Use `np.float64` instead.. Did you mean: 'float16'?
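The log above shows numpy 1.26.4 being replaced by numpy 2.0.0 while chroma-hnswlib is reinstalled, and the traceback ends at `np.float_`, which NumPy 2.0 removed. A minimal sketch of a version guard follows; the function name is hypothetical and the check looks only at the major version.

```python
def numpy_has_float_alias(version: str) -> bool:
    """Return True if this NumPy version still provides the `np.float_` alias.

    Hypothetical helper: the alias was removed in the 2.0 release, so only the
    major version number matters here.
    """
    major = int(version.split(".")[0])
    return major < 2


for v in ("1.26.4", "2.0.0"):
    print(v, numpy_has_float_alias(v))
# prints:
# 1.26.4 True
# 2.0.0 False
```

Keeping numpy below 2 in the image (for example, by not force-reinstalling dependencies at startup) is one possible workaround. Whether it fits this chart is an assumption, not something the report confirms.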
Support for Gateway API
Add optional config to use gateway API instead of ingress API.
No response
nice to have
No response
Be able to deploy Chroma in an EKS cluster
Use AWS docs and/or Terraform for the automation
N/A
I cannot run the chart without it
No response
Add images to docker hub
Improve user experience
No response
would make my life easier
No response
Hi,
Installing the chroma chart by adding it to the helm repo and doing a helm install, as described in your README, works fine.
helm repo add chroma https://amikos-tech.github.io/chromadb-chart/
But when I add chroma as a dependency to my Chart.yaml and do a helm dependency update, I get an error saying
Error: Chart.yaml file missing
I added the dependency this way in my Chart.yaml:
dependencies:
  - name: chromadb
    version: 0.1.19
    repository: "https://amikos-tech.github.io/chromadb-chart/"
I also tried using the name of the repo I had added with `helm repo add chroma https://amikos-tech.github.io/chromadb-chart/`, but that gave the same error.
dependencies:
  - name: chromadb
    version: 0.1.19
    repository: "@chroma"
Am I missing something here? Or is there an issue with pulling this chart as a dependency into my chart? Appreciate your support. Thanks!
0.4.3
1.27.x
local k8s cluster
Error: Chart.yaml file is missing
Error: INSTALLATION FAILED: chart requires kubeVersion: >= 1.23.0 <= 1.27.x which is incompatible with Kubernetes v1.28.0
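The error can be read as a simple range check on the cluster's major.minor version. The sketch below is a simplified illustration, not Helm's actual semver constraint implementation; the function names are hypothetical, and the bounds are taken from the error message.

```python
def parse_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings such as 'v1.28.0' or '1.27.x'."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def in_supported_range(cluster: str, lowest: str = "1.23.0", highest: str = "1.27.x") -> bool:
    """Check the cluster version against the bounds, comparing major.minor only."""
    return parse_minor(lowest) <= parse_minor(cluster) <= parse_minor(highest)


print(in_supported_range("v1.27.5"))  # True
print(in_supported_range("v1.28.0"))  # False
```

With the bounds from the message, a v1.28.0 server falls outside the range, which matches the installation failure.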
0.4.3
1.27.x
kubectl version
Client Version: v1.28.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.0
No response
Endpoint metrics
Instrument API to add Prometheus middleware
No response
would make my life easier
No response
Trigger new releases of the chart via new tags/releases in GH. Include changes made (e.g. merged PR etc)
New GH WF
No response
would make my life easier
No response
Embeddings are relatively cheap but not free, and data is precious so we need a way to keep our data safe from loss in a cloud-native setting.
It should be possible to automatically back up and restore data.
No response
I cannot run the chart without it
This would take us one step closer using ChromaDB in production
It seems that after a reset the vector stores are left behind. This could be problematic, as the DB can run out of space.
We suggest an opt-in approach where vector segments can be removed by a periodic clean up job.
We have the following setup:
chroma@chroma-chromadb-0:/index_data$ ls -latrh
total 128K
drwxr-xr-x 2 chroma chroma 4.0K Jul 29 10:45 63d9fc32-4b60-4cec-8273-ec1eff1eb8e5
drwxr-xr-x 2 chroma chroma 4.0K Jul 29 10:48 ae040612-374b-41cb-93ae-7f737bf0ee80
drwxr-xr-x 2 chroma chroma 4.0K Jul 29 10:52 f080e2b7-d8b3-4dde-812c-5d8e653111fe
drwxr-xr-x 2 chroma chroma 4.0K Jul 29 10:52 52dbd990-3e6d-458f-838e-e226a0e20a23
drwxr-xr-x 1 root root 4.0K Jul 29 11:00 ..
drwxr-xr-x 2 chroma chroma 4.0K Jul 29 11:04 06bd8d3d-9f7b-48c1-a5e9-425e24edc5f6
-rw-r--r-- 1 chroma chroma 96K Jul 29 11:04 chroma.sqlite3
drwxrwxrwx 7 root root 4.0K Jul 29 11:04 .
chroma@chroma-chromadb-0:/index_data$ sqlite3 chroma.sqlite3 "select id from segments where scope='VECTOR';"
06bd8d3d-9f7b-48c1-a5e9-425e24edc5f6
chroma@chroma-chromadb-0:/index_data$
As seen from the DB query, only 06bd8d3d-9f7b-48c1-a5e9-425e24edc5f6 is an active vector segment, and everything else can be "safely" removed.
We'll add a chart flag which, when enabled and isPersistent is set, creates a cronjob that periodically cleans up the stale segments.
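The check such a clean-up job would need can be sketched in Python. This is a minimal illustration that reuses the query shown above. The function name is hypothetical, it assumes every subdirectory of the index directory is a segment directory, and it only reports candidates without deleting anything.

```python
import sqlite3
from pathlib import Path


def find_orphan_segment_dirs(index_dir: str) -> list[Path]:
    """List subdirectories of index_dir that are not active VECTOR segments."""
    root = Path(index_dir)
    con = sqlite3.connect(root / "chroma.sqlite3")
    try:
        # Same query as in the shell session above.
        rows = con.execute("select id from segments where scope='VECTOR'").fetchall()
    finally:
        con.close()
    active = {row[0] for row in rows}
    # Assumption: every directory here is a segment directory named by its id.
    return sorted(p for p in root.iterdir() if p.is_dir() and p.name not in active)
```

Calling `find_orphan_segment_dirs("/index_data")` on the setup above would be expected to return the four directories other than 06bd8d3d-9f7b-48c1-a5e9-425e24edc5f6. Any deletion step should stay opt-in, as suggested.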
We don't need to reinstall the lib, as it is now (0.7.3+) delivered as a binary wheel.
Remove pip install --force-reinstall --no-cache-dir chroma-hnswlib from the entry point
N/A
would make my life easier
No response
Hi! 👋 Love this, thanks for making it. I tried this simple experiment on my computer. I was able to get a heartbeat. When I tried the following however:
import chromadb
client = chromadb.HttpClient(host="127.0.0.1", port="55897")
collection = client.create_collection("test")
print(collection.count())
It crashes
terminate called after throwing an instance of 'pybind11::error_already_set'
what(): SystemError: null argument to internal routine
/docker_entrypoint.sh: line 6: 13 Aborted (core dumped) uvicorn chromadb.app:app --workers 1 --host ${CHROMA_SERVER_HOST} --port ${CHROMA_SERVER_HTTP_PORT} --proxy-headers --log-config ${CHROMA_SERVER_LOG_CONFIG}
Just curious if you had run into this! thanks!
It would be nice to have some level of supply chain security
N/A
would make my life easier
No response
I want to expose my chromadb instance over the internet.
I need two things to ensure that my Chroma instance is safe:
Wait for the chroma team to add this feature
I cannot run the chart without it
No response
Supporting the new auth provider feature in Chroma
For basic auth:
No response
I cannot run the chart without it
No response
Community request on how to setup Chroma in AKS
Deployment guide
N/A
nice to have
No response
It would be nice to catch up with Chroma version
Versions 0.4.21 and 0.4.22 to be supported
No response
would make my life easier
No response
Support for latest k8s releases - https://kubernetes.io/releases/
Add new k8s releases in test matrix
No response
I cannot run the chart without it
No response
Thank you for providing this helm chart for deploying chromadb on a kubernetes cluster. I managed to make a successful deployment. But when I test the backend APIs on swagger, I notice that the POST APIs, like creating a new collection, return a 404 - Not found error. The GET APIs seem to be working fine, and so are the heartbeat and preflightcheck APIs.
I also checked via code (python) with a HttpClient, and saw the same 404 errors when trying to create a collection. I'm not sure what's going wrong, and what resource is 'not found' during creation. There are no other errors other than the 404, and I don't see any errors in the container logs either.
Appreciate any help here. Thank you!
(ChromaDB version: 0.4.24)
0.4.3
1.27.x
Managed kubernetes cluster
No response
Both the ingress port and the pod's configurable port share the same value, chromadb.serverHttpPort, from the values file
Move the ingress port to its own config in the values
N/A
nice to have
No response
As a user I want to be able to have a custom embedding function server based on the HuggingFace Embedding Inference server - https://github.com/huggingface/text-embeddings-inference
A simple replicaset that can scale the huggingface EF server, with initial CPU support only
N/A
would make my life easier
No response