dennwc / cas
Content Addressible Storage
License: Apache License 2.0
Hi,
Sorry if I have misunderstood something. I'm trying to use cas to store downloaded artifacts, version them, and later use them in other software (as a more powerful replacement for the typical ad-hoc downloader shell scripts you see everywhere).
% cas pull floodgate-spigot.jar https://download.geysermc.org/v2/projects/floodgate/versions/latest/builds/latest/downloads/spigot
% cas sync floodgate-spigot.jar
% cas checkout floodgate-spigot.jar ./floodgate-spigot.jar
cas checkout does not work anymore:
% cas pull floodgate-spigot.jar https://download.geysermc.org/v2/projects/floodgate/versions/latest/builds/latest/downloads/spigot
floodgate-spigot.jar = sha256:d6f3fb960861d6560259f894bd514fca37195c086d7f2c6800c4783d8cde2216
% cas sync floodgate-spigot.jar
floodgate-spigot.jar -> sha256:d6f3fb960861d6560259f894bd514fca37195c086d7f2c6800c4783d8cde2216 (up-to-date)
% cas checkout floodgate-spigot.jar ./floodgate-spigot.jar
Error: blob: invalid ref
<help msg>
2024/03/17 19:54:41 blob: invalid ref
%
The problem seems to be that a pull where nothing is updated creates a blob with @type: cas:WebContent that has an empty ref.
% find .cas/blobs -type f -size -500 |xargs -n1 grep .
...
{
"@type": "cas:WebContent",
"url": "https://download.geysermc.org/v2/projects/floodgate/versions/latest/builds/latest/downloads/spigot",
"ref": "sha256:4aca4a66a2641967dcc4b895dd1a7453f76b47c239e139f494a80c69066e55f1",
"size": 11235940,
"etag": "09b0c6b5cc19a1618c0b30ad13327890c",
"ts": "2024-02-18T14:41:25Z"
}
{
"@type": "cas:WebContent",
"url": "https://download.geysermc.org/v2/projects/floodgate/versions/latest/builds/latest/downloads/spigot",
"ref": "",
"etag": "09b0c6b5cc19a1618c0b30ad13327890c",
"ts": "2024-02-18T14:41:25Z"
}
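To list such records without reading through grep output, something like the following could be used. It is only a sketch: it assumes that small files under .cas/blobs are standalone JSON documents shaped like the two records above, and `find_empty_web_refs` and its `max_size` cutoff are hypothetical names chosen for illustration, not part of cas.

```python
import json
from pathlib import Path


def find_empty_web_refs(blobs_dir, max_size=64 * 1024):
    """Return paths of cas:WebContent JSON blobs whose "ref" is empty or missing.

    Assumption: metadata blobs are small JSON files; anything larger than
    max_size bytes, or anything that does not parse as JSON, is skipped.
    """
    hits = []
    for path in sorted(Path(blobs_dir).rglob("*")):
        if not path.is_file() or path.stat().st_size > max_size:
            continue
        try:
            doc = json.loads(path.read_bytes())
        except (ValueError, OSError):
            continue  # binary payload, invalid JSON, or unreadable file
        if (
            isinstance(doc, dict)
            and doc.get("@type") == "cas:WebContent"
            and not doc.get("ref")
        ):
            hits.append(path)
    return hits


# Example call (path taken from the find command above):
#   for p in find_empty_web_refs(".cas/blobs"):
#       print(p)
```

This only reports the suspicious blobs; it does not modify or repair anything in the store.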
Go was installed from upstream via godeb install 1.22.1:
% go version
go version go1.22.1 linux/amd64
cas was installed via:
% go install github.com/dennwc/cas/cmd/cas@latest
Besides that, is there a way in cas to see the history / log of which file versions a pin has had (to get the old state back quickly in case something broke)? Or is it just grep/jq over the index objects?
Hashes like sha256 of the binary file content are highly sensitive to changes in the file that may end up being practically inconsequential. An alternative strategy is to use hashes that are aware of the file's formatting structure and hash only the important content while ignoring the formatting.
Here are some examples:
The sum command of seqkit, which produces a content-lenient hash of FASTA format files: https://bioinf.shenwei.me/seqkit/usage/#sum
Would it be possible for CAS to support such formatting-lenient content hashes?
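To make the idea concrete, here is a minimal sketch of a formatting-lenient digest for FASTA text. It is a simplified illustration and not the algorithm that seqkit sum uses; the function name `fasta_content_digest` and the choice to ignore header text, line wrapping, blank lines, and letter case are assumptions made for this example.

```python
import hashlib


def fasta_content_digest(text):
    """Hash the sequence content of FASTA text while ignoring its formatting.

    Ignored: header text after '>', line wrapping, blank lines, letter case.
    Kept: the sequence letters and the boundaries between records.
    """
    h = hashlib.sha256()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Mark a record boundary, but do not hash the header text itself.
            h.update(b">")
            continue
        h.update(line.upper().encode("utf-8"))
    return "sha256:" + h.hexdigest()


wrapped = ">seq1\nACGTACGT\nACGT\n"
rewrapped = ">seq1 some description\nacgtac\ngtacgt\n\n"
print(fasta_content_digest(wrapped) == fasta_content_digest(rewrapped))
```

Whether header text or case should count as "important content" depends on the use case, which is one reason such hashes need to be format-specific.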
Hi there,
Sorry that I keep driving by your repo with questions... I'm wondering if git-annex support could be on the cas roadmap, rather than git LFS?
I say this for no other reason than that git-annex is an alternative to git LFS that is also widely used. See here for the git-annex homepage:
https://git-annex.branchable.com
See here for "DataLad", an end-to-end scientific wrapper around git and git-annex, which might inspire some thinking for cas...
https://handbook.datalad.org/en/latest/basics/101-180-FAQ.html
Hi there,
I'm working on a Linux HPC cluster that doesn't have xattrs available. Is cas still able to function if xattrs are not available, or is there a workaround?
2023/02/21 10:14:51 xattr.FSet hisat2_12B1-RiboZero.merged.bam user.cas.size: operation not supported
Notably, setting the xattrs does seem to work on macOS with the APFS filesystem:
user.cas.hash: sha256:b99ea5e9a6e80b9e3cd9f5df62bf6a6324ee79e6529384ae44099a92b630f58f
user.cas.mtime:
0000 5C 89 30 FF 0F E9 45 17 ..0...E.
user.cas.size:
0000 19 C6 3B 65 00 00 00 00 ..;e....
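For readers wondering what the two hex dumps contain, the sketch below decodes them under an assumption that is not confirmed anywhere in this thread: that each value is an unsigned 64-bit little-endian integer, with user.cas.mtime counted in nanoseconds since the Unix epoch. The helper name `decode_le_u64` is made up for this example.

```python
from datetime import datetime, timezone


def decode_le_u64(hex_bytes):
    """Decode 8 space-separated hex bytes as an unsigned little-endian integer."""
    raw = bytes.fromhex(hex_bytes)
    if len(raw) != 8:
        raise ValueError("expected exactly 8 bytes")
    return int.from_bytes(raw, "little")


# Values copied from the xattr dump above.
size = decode_le_u64("19 C6 3B 65 00 00 00 00")
mtime_ns = decode_le_u64("5C 89 30 FF 0F E9 45 17")

# Assumption: mtime is stored as nanoseconds since the Unix epoch.
mtime = datetime.fromtimestamp(mtime_ns // 10**9, tz=timezone.utc)

print(size)               # 1698416153 (bytes, if the assumption holds)
print(mtime.isoformat())
```

Under that reading the values look plausible for a large BAM file modified in February 2023, but the actual encoding should be checked against the cas source.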
Hi there,
Is it possible to resolve the full path of a file from its cas hash (i.e. analogous to cas blob, but returning the local file path instead)?
I'm imagining the use case where I could keep better track of large files that are identical on both a remote and local storage, but might have distinct paths / be moving around.
Feel free to say if this is a misunderstanding of how content addressable storage can/should work.
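Independently of whether cas offers this, the lookup being asked about can be sketched as a reverse index from content hash to paths. This is an illustration only: it rehashes every file instead of using anything cas stores, the names `sha256_ref` and `index_paths_by_hash` are hypothetical, and skipping a directory called .cas is an assumption about the store layout.

```python
import hashlib
import os
from collections import defaultdict


def sha256_ref(path, chunk_size=1 << 20):
    """Return 'sha256:<hex>' for a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return "sha256:" + h.hexdigest()


def index_paths_by_hash(root):
    """Map each content hash to every regular file under root with that content."""
    index = defaultdict(list)
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: the store itself lives in a directory named .cas.
        dirnames[:] = [d for d in dirnames if d != ".cas"]
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            if os.path.isfile(full) and not os.path.islink(full):
                index[sha256_ref(full)].append(full)
    return index


# Example call:
#   index = index_paths_by_hash(".")
#   print(index.get("sha256:b99ea5e9...", []))   # elided hash, fill in a real one
```

Rehashing is slow for large files; where the user.cas.hash xattr shown in the earlier issue is available, reading it instead might avoid that cost, though that is untested here.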