Comments (3)
Hi @NilsKrattinger, we would recommend continuing to use CVMFS, for several reasons.
- In general, CVMFS has been the main reference data source used by the Galaxy community.
- We considered alternatives because there was a period in which the CSI driver for CVMFS in particular lacked maintenance. This has since been resolved, and we are seeing regular updates.
- s3-csi (and alternatives like Yandex) does not perform as well as CVMFS when dealing with lots of small files. This is because each file requires its own request, whereas CVMFS uses a larger block-based format.
- The above, plus the fact that CVMFS is generally optimized for read-only data and the S3 alternatives are not, resulted in worse performance for refdata. We observed a dramatic increase in startup times after switching to s3-csi. Future optimizations may be able to remedy this to some degree, but at least for the time being, it's the slower option.
For these reasons, we have switched back to CVMFS as the refdata repository of choice. Ideally, however, the choice should not matter: it should (at least in principle) be trivially possible to switch from one to the other just by enabling/disabling the corresponding option, after which the chart will simply use the new source.
from galaxy-helm.
@nuwang Thanks for sharing the reasons and story behind this, really appreciate it!
As for the low coupling between refdata and the storage solution, I fully agree that the CSI / storage solution should ideally not matter, as long as it supports ROX (ReadOnlyMany) operations.
Going down that road, would something like a `refdata.storageClass` attribute in `values.yaml` be desirable?
To not break compatibility, I think this would have to be overridden by `cvmfs.enable` and `S3FS.enable`, or it could default to `cvmfs` / `s3fs` when left empty?
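To make the second option concrete (default when left empty), here is a rough sketch of the fallback logic in Python. It is only an illustration of the idea, not how the chart works today: the function name, the flag names and the default storage class names are all hypothetical placeholders.

```python
from typing import Optional

# Hypothetical default storage class names, used for illustration only.
DEFAULT_STORAGE_CLASSES = {"cvmfs": "cvmfs-refdata", "s3fs": "s3fs-refdata"}


def resolve_refdata_storage_class(
    storage_class: Optional[str],
    cvmfs_enabled: bool,
    s3fs_enabled: bool,
) -> str:
    """Pick the storage class for the refdata PVC."""
    # An explicit, non-empty value always wins.
    if storage_class:
        return storage_class
    # Otherwise fall back to whichever backend is enabled, preferring CVMFS.
    if cvmfs_enabled:
        return DEFAULT_STORAGE_CLASSES["cvmfs"]
    if s3fs_enabled:
        return DEFAULT_STORAGE_CLASSES["s3fs"]
    raise ValueError("no refdata storage class set and no backend enabled")
```

With this shape, existing deployments that only set the enable flags would keep their current behaviour, while anyone who sets the storage class explicitly could point refdata at any backend that supports ROX.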
Yes, the storage class can currently be specified and it will be used when creating the pvc:
However, I just realised that changing the storage class on an existing PVC midstream is not supported by k8s, so we should probably add a feature to allow the PVC itself to be injected externally. Once that is done, it should be possible to switch the PVC transparently, with running jobs continuing to use the old PVC and new jobs simply switching to the new one.
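As a toy model of that switch-over, the Python sketch below assumes each job records the claim name at the moment it is launched. The class and field names are hypothetical and do not correspond to anything in the chart or in Galaxy.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    name: str
    refdata_claim: str  # PVC name captured when the job was created


@dataclass
class RefdataConfig:
    claim_name: str  # externally supplied PVC name (hypothetical setting)
    jobs: List[Job] = field(default_factory=list)

    def launch(self, name: str) -> Job:
        # New jobs read the claim name that is current at launch time.
        job = Job(name=name, refdata_claim=self.claim_name)
        self.jobs.append(job)
        return job


config = RefdataConfig(claim_name="refdata-cvmfs")
old_job = config.launch("job-a")

# Point the configuration at a different, externally created PVC.
config.claim_name = "refdata-s3"
new_job = config.launch("job-b")
```

Here `old_job` still references `refdata-cvmfs` while `new_job` picks up `refdata-s3`, which is the behaviour described above: nothing is changed on the existing PVC, only the reference used by newly created jobs.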