npm / cacache
npm's content-addressable cache
License: Other
NpmCache::getCacheDbItems failed due to:
TypeError: buckets.map is not a function
    at readdirOrEmpty.then.buckets (main.js:238:3754)
    at Function.reReject (main.js:4453:1407)
    at cacache_1.default.ls.then.catch.e (main.js:5513:7616)
  code: undefined
Looks like there is an error while reading the directory.
function readdirOrEmpty (dir) {
return fs.readdir(dir).catch((err) => {
if (err.code === 'ENOENT' || err.code === 'ENOTDIR') {
return []
}
throw err
})
}
But the result is not properly handled here:
Promise.resolve().then(async () => {
const buckets = await readdirOrEmpty(indexDir)
await Promise.all(buckets.map(async (bucket) => {
...
Expected: the promise is rejected instead of crashing the process.
Issue is sporadic.
node:19.6.0-alpine
import cacache from "cacache"
const cachePath = "./cache"
const key = "test-key"
await cacache.put(cachePath, key, "10293801983029384")
const cached = await cacache.get(cachePath, key)
console.log(`Data: ${cached.data}`)
await cacache.rm.all(cachePath)
const cached2 = await cacache.get(cachePath, key)
console.log(`Data after rm.all: ${cached2.data}`)
Output
Data: 10293801983029384
Data after rm.all: 10293801983029384
After rm.all, cache keys still exist and nothing is deleted from the filesystem.
Exception: NotFoundError: No cache entry for test-key found in ./cache
Works as expected in version 16.1.3
The documentation says this about the time property: "Timestamp the entry was first added on."
Running cacache.verify() internally uses the insert() method, which resets the time property to Date.now().
This is not what a user expects, and it makes the time property unusable.
const cachePath = '/tmp/cacache'
const key = 'key'
await cacache.put(cachePath, key, 'hello')
const item1 = await cacache.get.info(cachePath, key)
await cacache.verify(cachePath)
const item2 = await cacache.get.info(cachePath, key)
console.log(item1.time, item2.time, item1.time === item2.time ? 'ok' : 'WRONG')
Hi, I would like to replace the move-concurrently package with a new one crafted from scratch. The move-concurrently package and its dependencies are outdated, with a lot of legacy code.
(The dead emoji in my screenshot means no update for a year or more, with dependencies that are not up to date either.)
I guess the new package will only need rimraf to support Node.js 10 (after its EOL we can move to the new recursive fs.rmdir).
Do you think this is a good idea? Or would you prefer to keep updating the existing packages?
Best Regards,
Thomas
After updating to v12 and later (the version I currently use is 12.0.2), I get the following error on put operations:
{ [Error: EPERM: operation not permitted, lchown '/tmp/content-v2/sha512/ec/9e']
cause:
{ Error: EPERM: operation not permitted, lchown '/tmp/content-v2/sha512/ec/9e'
errno: -1,
code: 'EPERM',
syscall: 'lchown',
path: '/tmp/content-v2/sha512/ec/9e' },
isOperational: true,
errno: -1,
code: 'EPERM',
syscall: 'lchown',
path: '/tmp/content-v2/sha512/ec/9e' }
And I actually see the files and folders created.
With both versions (11+ and 12+) the owner of the folders is my user.
And here is the code portion I use:
import { put as putCache } from "cacache/en";
const CACHE_PATH = "/tmp";
export const put = async <T>(key: string, payload: T, ttl: number = 10_000) => {
try {
await putCache(
CACHE_PATH,
key,
JSON.stringify({
payload,
timestamp: Date.now(),
ttl,
}),
);
return payload;
} catch (error) {
console.log("Put Cache", error);
return null;
}
};
OS: macOS 10.14 and AWS Lambda
Node Version: 10.16.0 and 12.6.0
Hi @isaacs
I'm looking for a library to manage a local-file-system cache to be used in a node.js server.
Maybe I'm way off here, but I'm trying to understand whether cacache can fit. The problem is that I need TTL and (more importantly) max-size features, so I can control the size the cache takes on the fs and keep it from exploding. Is this possible with cacache? Looking at the code, I could not see a way to do this.
I've also been playing with the idea of using your lru-cache to "mirror" the file-system cache: basically store the files on the fs and then add them to the lru-cache. When I get the dispose event, I delete the file, so this effectively gives me an LRU cache on the file system.
Since I run on Docker, there is no risk of the in-RAM lru-cache getting out of sync with the fs (if the server crashes or stops, the container is killed and recreated with a fresh, clean fs).
Does this make any sense?
This might be a little out of cacache's lane, not sure.
npm writes a few random files into the cache folder.
When it does this, it has to reproduce the logic that cacache (now) has built-in to preserve ownership and keep track of these files.
Idea: add a misc folder in cacache that could be used for these sorts of things. Files in this folder would not be indexed based on their content or name.
Also, the npm cli has some logic to store at most 10 debug log files, deleting old ones as new ones are added. It'd be great if cacache had a way to define a folder that can contain at most n files, and take care of the pruning.
The argument against doing this is that it's not really a "content addressable cache" at that point, so doesn't exactly make sense to have cacache manage it. But, it does already manage tmp files, which are also not content addressable, so maybe it's not so much of a stretch?
Failing this (or in addition to it), lib/util/infer-owner.js should probably be split out into a separate module, which would at least make it easier to get the ownership right when other code writes into the cache directly.
my npmrc file contains
prefix=${XDG_DATA_HOME}/node
cache=${XDG_CACHE_HOME}/npm
init-module=${XDG_CONFIG_HOME}/npm/config/npm-init.js
cacache frequently ignores this, causing a ~/.npm directory to reappear.
cacache should always use the configured path (which in this case is ~/.cache/npm/) when used through npm.
(npm publish is about all I use npm for.) The ~/.npm directory reappears with the new cache.
Running example failed
fetch(
'https://registry.npmjs.org/cacache/-/cacache-1.0.0.tgz'
).then(data => {
return cacache.put(cachePath, 'registry.npmjs.org|[email protected]', data)
}).then(integrity => {
console.log('integrity hash is', integrity)
})
(node:443) UnhandledPromiseRejectionWarning: TypeError: Data must be a string or a buffer
    at Hash.update (crypto.js:99:16)
    at algorithms.reduce (/repo/apps/allen/node-test/node_modules/ssri/index.js:322:44)
    at Array.reduce (<anonymous>)
    at Object.fromData (/repo/apps/allen/node-test/node_modules/ssri/index.js:317:21)
    at write (/repo/apps/allen/node-test/node_modules/cacache/lib/content/write.js:31:20)
    at Object.putData [as put] (/repo/apps/allen/node-test/node_modules/cacache/put.js:20:10)
    at fetch.then.data (/repo/apps/allen/node-test/tests/cacache.js:39:20)
    at process._tickCallback (internal/process/next_tick.js:189:7)
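The TypeError comes from handing the fetch Response object itself to cacache.put(), which expects a string or Buffer. A sketch of the missing conversion step (assumes a WHATWG-style fetch, e.g. the global fetch in Node 18+; fetchBuffer is a hypothetical helper):

```javascript
// cacache.put() needs a string or Buffer; a fetch Response is neither,
// so the body has to be read and converted first.
async function fetchBuffer (url) {
  const res = await fetch(url)
  if (!res.ok) throw new Error(`fetch failed: ${res.status}`)
  return Buffer.from(await res.arrayBuffer())
}
// then: const integrity = await cacache.put(cachePath, key, await fetchBuffer(url))
```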
I think it'd be good for cacache to raise custom errors in known cases, so that it could be more clear that an error is coming from cacache for a known scenario.
This is necessary in Pacote, where it retries on ENOENT errors, but it should only do that for ENOENT errors that indicate a cache miss, versus an ENOENT from trying to read a tarball or directory.
We already duplicate the sizeError function in a few places, so why not just have a dedicated SizeError class?
All of the errors created should set this.cacache = true, and a meaningful error code. Following the pattern in node-tar and node-fetch/minipass-fetch, it'd also be super handy when debugging issues that bubble up to the CLI if we could narrow in on the subsystem involved, and could even provide more user-friendly reporting.
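A minimal sketch of what such error classes could look like; the names and codes here are illustrative, not cacache's actual implementation:

```javascript
// Sketch of dedicated cacache error classes: a base class carrying the
// `cacache` flag and a code, plus two concrete errors. Names and codes
// are illustrative only.
class CacacheError extends Error {
  constructor (message, code) {
    super(message)
    this.cacache = true // lets callers recognize the subsystem
    this.code = code
  }
}

class SizeError extends CacacheError {
  constructor (expected, found) {
    super(`Bad data size: expected ${expected} bytes, found ${found}`, 'EBADSIZE')
    this.expected = expected
    this.found = found
  }
}

class EntryNotFoundError extends CacacheError {
  constructor (cache, key) {
    // Same code as fs, but distinguishable via the `cacache` flag,
    // which is exactly what the pacote retry logic needs.
    super(`No cache entry for ${key} found in ${cache}`, 'ENOENT')
  }
}
```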
The dependency on "tar": "^6.0.2" puts the cacache package on a high-severity vulnerability list. Please update cacache so that the vulnerability is patched.
Upgrade tar to version 6.1.9, 5.0.10, 4.4.18 or higher.
npm outlines an advisory as of Mar 29th, 2021:
y18n before versions 3.2.2, 4.0.1, and 5.0.5 is vulnerable to prototype pollution.
Process terminates with error:
node:internal/process/promises:289
triggerUncaughtException(err, true /* fromPromise */);
^
[Error: EACCES: permission denied, open 'tmp/3164ed41'] {
errno: -13,
code: 'EACCES',
syscall: 'open',
path: 'tmp/3164ed41'
}
Expected: the process does not terminate, and the stream error handler is called.
Run the following script. It will change the tmp folder permission so that an error is triggered. It can happen in other circumstances too (out of disk space, etc.).
In certain cases this results in a lingering rejected promise (handleContentP within lib/content/write.js). In the example, such a case is when the readable stream has not ended.
const { Readable } = require('stream');
const { chmod } = require('fs/promises');
const { existsSync } = require('fs');
const cacache = require('cacache');

const read = (...params) => new Readable({
  read (size) {
    if (params.length) this.push(params.shift());
  }
});

// Keep the process alive so the unhandled rejection is observable.
require('http').createServer().listen(() => console.log('listening'));

(async () => {
  if (existsSync('./tmp')) await chmod('./tmp', 0o777);
  await cacache.put('.', 'key1', '1');
  await chmod('./tmp', 0o555); // make tmp unwritable to trigger EACCES
  return read('2').pipe(cacache.put.stream('.', 'key2')).on('error', console.error);
})().catch(console.error);
Calling ls.stream() over a large store results in a large memory hit for all entries.
An iterator which returns entries as needed would avoid this.
Hi! I'm trying to implement a content-addressable cache using the xxhash3
algorithm, which is not natively supported by node:crypto
. Is this possible, for example by overriding ssri
?
Links to the contributor guide and code of conduct result in 404s; the files seem to have been removed. Did you mean to link to central files instead?
mkdirp should be at least 0.5.3 to bring in the patched minimist version.
ref: https://snyk.io/test/npm/mkdirp/0.5.0
ref: https://github.com/isaacs/node-mkdirp/releases/tag/v0.5.3
~ $ npm install yarn
npm ERR! code EACCES
npm ERR! syscall link
npm ERR! path /data/data/com.termux/files/home/.npm/_cacache/tmp/14de33dd
npm ERR! dest /data/data/com.termux/files/home/.npm/_cacache/content-v2/sha512/d1/d2/d481d770644c0c5e31275a2b952a18da6097da58f146549fb26a5f5d8ac389ffcd10db5d924df1176590499cd2d92b5c21f948efab003774723c809d2d6c
npm ERR! errno EACCES
npm ERR!
npm ERR! Your cache folder contains root-owned files, due to a bug in
npm ERR! previous versions of npm which has since been addressed.
npm ERR!
npm ERR! To permanently fix this problem, please run:
npm ERR! sudo chown -R 10427:10427 "/data/data/com.termux/files/home/.npm"
npm ERR! A complete log of this run can be found in:
npm ERR! /data/data/com.termux/files/home/.npm/_logs/2022-11-21T01_31_59_253Z-debug-0.log
Expected: npm installs the package successfully.
In Termux, run
apt update && apt upgrade
apt install nodejs-lts
npm install -g [email protected]
npm install yarn
Then the error message will occur.
fs.link will try to create a hard link, which is disallowed by seccomp on Android. The maintainers of the Termux packages have applied a patch to solve this, but when users update cacache or npm, the patch no longer applies.
Related issues: termux/termux-packages#11293, termux/termux-packages#13293
Possible patch: https://github.com/termux/termux-packages/blob/master/packages/nodejs/deps-npm-node_modules-cacache-lib-util-move-file.js.patch
The glob package version is outdated and needs to be updated to resolve a Snyk issue.
glob should be updated to version 9 or later.
I am using pacote to download and cache, in addition to reasonably sized source libraries, large binaries (like toolchain distributions, hundreds of MB) and I would appreciate a CLI solution to enumerate the content of the cache, such that later to be able to selectively remove some cached files. For now the only method I found was to completely remove the cache, which is far from optimal.
Thank you,
Liviu
I can't upgrade a package that depends on this package, since I was using the same version of node (10.16.1) that the server uses, but it just throws an error that is hard to trace because the incompatibility is in a subdependency.
When I run yarn install on the server, where node is 10.16 (supposedly compatible with this package), yarn throws an error that move-file is not compatible.
To reproduce, just run:
nvm use 10.16
yarn install
If move-file is really needed, then update the node engine constraint to >= 10.17.
Related to raineorshine/npm-check-updates#651
Related to npm/pacote#41
Broken by sindresorhus/move-file#8
I can't find an easy way to tell whether an object exists by key without actually getting it.
I was wondering if it's possible to get info for a cache entry by digest, something like cacache.get.info.byDigest(cache, integrity)?
cacache.get.info(cache, key) only takes a key, and get.info.byDigest doesn't exist. I see cacache.get.hasContent(cache, integrity), which finds my cache entry but only tells me its size and some stats about the file. If it also returned the key, then I could use get.info.
The only alternative I can really see to get the info (including metadata, which I'm really interested in) is using ls and iterating over its output, but that seems kinda crazy!
Also, I see that cacache.get.byDigest does not return metadata by design, so there is probably a technical reason this is not possible?
Hi @nlf, @isaacs, I'd like to report a vulnerability issue in cacache:
A vulnerability CVE-2021-27290 (high severity) detected in package ssri (>=5.2.2 <6.0.2,>=7.0.0 <8.0.1) is directly referenced by cacache 10.0.4. We noticed that such a vulnerability has been removed since cacache 11.0.1.
However, cacache's popular previous version [email protected] (1,015,201 downloads per week) is still transitively referenced by a large number of latest versions of active and popular downstream projects (about 5,086 downstream projects, e.g., @toptal/davinci-engine 3.3.0, @toptal/davinci-syntax 6.4.0, @toptal/davinci 4.2.2, @toptal/davinci-cli-shared 1.3.4, @toptal/davinci-bootstrap 2.1.74, @3liv/[email protected], @9188/[email protected], @akala-modules/[email protected], @aquestsrl/[email protected], etc.).
As such, issue CVE-2021-27290 can be propagated into these downstream projects and expose security threats to them.
These projects cannot easily upgrade cacache from version 10.0.4 to 11.*.*. For instance, [email protected] is introduced into the above projects via the following package dependency paths:
(1) @3liv/[email protected] ➔ @3liv/[email protected] ➔ @3liv/[email protected] ➔ [email protected] ➔ [email protected] ➔ [email protected]
(2) @9188/[email protected] ➔ [email protected] ➔ [email protected] ➔ [email protected] ➔ [email protected]
(3) @akala-modules/[email protected] ➔ @akala/[email protected] ➔ [email protected] ➔ [email protected] ➔ [email protected] ➔ [email protected]
(4) @aquestsrl/[email protected] ➔ @aquestsrl/[email protected] ➔ [email protected] ➔ [email protected] ➔ [email protected] ➔ [email protected]
......
The projects such as @3liv/fero-resource-subscriber, w-webpack, @aquestsrl/create-app-cli-utils and server-static etc., which introduced [email protected], are not maintained anymore. These unmaintained packages can neither upgrade cacache nor be easily migrated by the large number of affected downstream projects.
On behalf of the downstream users, could you help us remove the vulnerability from package [email protected]?
Sorry for the inconvenience caused.
Since these inactive projects set a version constraint of ~10.0.* for cacache on the above vulnerable dependency paths, if cacache removes the vulnerability from 10.0.4 and releases a new patched version [email protected],
such a vulnerability patch can be automatically propagated into the 5,086 affected downstream projects.
In [email protected], you can kindly try to perform the following upgrade:
ssri ^5.2.4 ➔ 5.2.1;
Note:
ssri versions <5.2.2, >=6.0.2 <7.0.0, and >=8.0.1 don't have the vulnerability CVE-2021-27290.
Thanks again for your contributions.
Best regards,
Paimon
Apparently put.stream() with integrity checks leaves the cache in an inconsistent state, and all keys inserted with this procedure are purged by the next verify().
I tried to find some use cases of put.stream(), but I could not find anything in npm, nor any tests here. So, although I don't rule out an improper use in my code, I guess we are facing a small bug that slipped through due to incomplete testing.
Always.
I'm using put.stream() to add large archives to the cache, and I have the expected sha256 digest for each archive.
If I set opts.algorithms = ['sha256'] and check the resulting integrity[hashAlgorithm][0].source, I get the expected digest, and everything is fine.
If I set opts.integrity to the expected digest, the call seems OK too: it does not throw any error, the archive is saved in the cache, and decompressing it works fine.
However, if I run verify(), the associated entries are removed, as if there were something wrong with them.
1. put.stream() with opts.algorithms = ['sha256']
2. verify()
3. ls(); the recently added key is there
4. put.stream() with opts.integrity set to the digest
5. verify()
6. ls(); the key is no longer in the cache
Note: using sha256 might not be relevant, but I mentioned it to match my test.
Running verify() after put.stream() with integrity checks should not remove the entries from the cache.
When I install react via npm there is a _cacache folder in the root, and under that there is also a _lock folder.
This didn't happen before.
That gives git a hard time: 5000 files waiting.
The node_modules folder already has a _cacache.
I've been looking for solid, stable, and well-maintained Node.js file-caching libraries. (Un)fortunately, cacache was the only one I could find.
Now, having general experience with remote caching, looking at the cacache documentation for the first time was overwhelming. Tarball why? Integrity what? ...
After experimenting a bit, I believe I figured out the basics and wrote a simplified API around it.
// cache.js
import path from "path";
import { fileURLToPath } from "url";
import cacache from "cacache";
// __dirname is not available in ES modules; derive it from import.meta.url
const __dirname = path.dirname(fileURLToPath(import.meta.url));
const cachePath = path.join(__dirname, ".cache");
/**
* Simplified Cache API written on top of cacache
* This interface makes using cacache bit more straightforward without
* deeper understanding of caching itself.
* @see https://github.com/npm/cacache
*/
/**
* Set or override key/value in the cache
*/
const put = async (key, value) => {
const writeAction = await cacache.put(
cachePath,
key,
JSON.stringify(value)
);
return writeAction;
};
/**
* Retrieve a value from the cache.
*/
const get = async (key) => {
const readAction = await cacache.get(cachePath, key);
/**
* Returned "data" key contains data as a Buffer object
* Convert Buffer binary contents to string
* @see https://nodejs.org/en/knowledge/advanced/buffers/how-to-use-buffers/#reading-from-buffers
*/
const dataJSON = readAction.data.toString();
const data = JSON.parse(dataJSON);
return data;
};
const remove = async (key) => {
const removeAction = await cacache.rm.entry(cachePath, key, {
removeFully: true,
});
return removeAction;
};
/**
* Clears the entire cache
*/
const destroy = async () => {
const destroyAction = await cacache.rm.all(cachePath);
return destroyAction;
};
export const cache = {
put,
get,
remove,
destroy,
};
Before putting this in a production environment, convince me why the above simplified interface is a bad idea for storing key/value pairs, where values may be strings, numbers, or objects with 500 properties?
Right now, if you provide multiple algorithms to content.write(), it'll error out with:
opts.algorithms only supports a single algorithm for now
It has said that for a long time. Let's support multiple algorithms!
This causes some suboptimal caching in make-fetch-happen, because we may have an integrity value that is a sha512, but it always caches as sha1, so we can never have a cache hit.
In lib/content/write.js, we always place the content in a single location based on the integrity and algorithm.
tar < 6.2.1 has been flagged by GitHub with a moderate security vulnerability, and it is recommended to upgrade to >= 6.2.1
See Denial of service while parsing a tar file due to lack of folders count validation
@npmcli/[email protected]: This functionality has been moved to @npmcli/fs
npm WARN deprecated @npmcli/[email protected]: This functionality has been moved to @npmcli/fs
Can you please make new releases when this issue is fixed?
Versions 15.0.6 and 12.0.5 (a 12.x release would be nice because many projects depend on cacache 12.x).
ssri 5.2.2-8.0.0, fixed in 8.0.1, processes SRIs using a regular expression which is vulnerable to a denial of service. Malicious SRIs could take an extremely long time to process, leading to denial of service. This issue only affects consumers using the strict option.
The fix is to bump ssri to 8.0.1.
The current cache layout uses (afaict) the first 4 hex digits to make 2 levels of directories and then stores the file. So abcdefgh => /ab/cd/efgh
I believe this to be inefficient, because modern filesystems can handle millions of files in a single directory just fine.
With the current layout, the fs needs to do 3 tree lookups to find the inode of a filename: 2 in shallow trees and 1 in a deep tree. By putting all files in 1 directory, there's only 1 lookup in a slightly deeper tree.
Furthermore, unless you have 10s of thousands of files, you end up creating a lot of directories with just one or two entries in them. So you're actually increasing the required storage.
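The layout described above can be expressed as a tiny helper (a sketch mirroring the description, not cacache's actual code):

```javascript
// The first 4 hex digits of the hash become two directory levels;
// the remainder is the filename. contentPath is an illustrative helper.
function contentPath (hash) {
  return [hash.slice(0, 2), hash.slice(2, 4), hash.slice(4)].join('/')
}
// contentPath('abcdefgh') → 'ab/cd/efgh'
```

This makes the lookup cost visible: resolving the file takes three directory traversals, versus one for a flat layout.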
In order to safely prune old content after storing fresh content in the cache, we need some means of knowing if the old content has now become orphaned. The fastest approach to this is reference counting.
Implementation will require
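An in-memory sketch of the reference-counting idea (names are illustrative, not a proposed cacache API):

```javascript
// Each index entry pointing at a content digest takes a reference;
// content whose count drops to zero is orphaned and safe to prune.
const refs = new Map()

function addRef (digest) {
  refs.set(digest, (refs.get(digest) || 0) + 1)
}

function removeRef (digest) {
  const n = (refs.get(digest) || 0) - 1
  if (n <= 0) {
    refs.delete(digest)
    return true // orphaned: safe to prune the content file
  }
  refs.set(digest, n)
  return false
}
```

In practice the counts would have to live on disk next to the index so they survive restarts, which is the main implementation cost.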
An EMFILE error may be thrown during npm install, depending on the allowed file descriptors and the state of the cache before the install. An example error:
Error: EMFILE: too many open files, open '/Users/<user>/.npm/_cacache/index-v5/64/ee/136420e5adf6592619d25b411c7849220f30364ed8ba96dea19887a5d1f2'
npm install should succeed.
{
"name": "nuxt-app",
"devDependencies": {
"nuxt": "^3.7.0"
}
}
(The error happens reliably with nuxt, but it's not related to nuxt, you can use other packages and get the same result.)
1. ulimit -Hn 128 (it's possible to get the error with a higher ulimit, but using a low value helps to reliably reproduce it)
2. rm -rf /<home-dir>/.npm/_cacache/
3. rm -rf node_modules
4. npm install
5. Reset ulimit using ulimit -Hn unlimited
A security assessment was performed and vulnerabilities were found in the dependency sane.
It is requested to update package-lock.json from "y18n": "^4.0.0" to "y18n": "^5.0.5".
reference:
Hi!
How do you invalidate npm packages in the cache? It looks like packages are only invalidated by time.