Comments (7)
> So my proposal is only valid iff you want to split the feature in two.

I would just do it in one go, a single caching feature. I'm interested in the duration-based caching, and you seem interested in the `forever` caching... so the feature will be available for both of us at once. 🙂

> I suppose "before anything" means before loading the style, wherever it comes from.

Yes.
from nitpick.
Hello @bibz, and thanks a lot for such detailed specs and comments.
> - an integer `n` > 0: the cache expires after `n` seconds (or minutes, or hours, or days).
Instead of an integer, we could accept English strings and infer the `datetime.timedelta` from them:

- 34 seconds = `timedelta(seconds=34)`
- 20 minutes = `timedelta(minutes=20)`
- 1 hour = `timedelta(hours=1)` (add the `s` if the time unit is singular)
- 3 days = `timedelta(days=3)`
The cached styles would live in the cache directory, keyed by the hash of their URI. Individual styles would not be cached, only the resulting generated style (which currently lives in the cache directory, by the way). Keying by URI makes handling immutable styles painless.
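A stdlib sketch of this keying scheme (the `cache_path` helper name is hypothetical, not part of Nitpick):

```python
import hashlib
from pathlib import Path


def cache_path(cache_dir: Path, uri: str) -> Path:
    """Key the cached file by the SHA-256 hex digest of its URI."""
    digest = hashlib.sha256(uri.encode("utf-8")).hexdigest()
    return cache_dir / digest
```

Because the key depends only on the URI, an immutable style is fetched once and then always served from the same file.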
I was thinking of a simple (maybe dumb) approach: just caching the HTTP requests using another module.
I found some alternatives:
- kevin1024/vcrpy: I used this before for automated tests, I like it
- reclosedev/requests-cache
- ionrock/cachecontrol
I still don't know which one to use.
VCRPy works nicely, but it might be large, and I only need to capture `requests` calls.
A smaller module might be better.
With any of these modules, I think we wouldn't need to hash URIs nor care about the origin server.
The cache invalidation would be something like "delete the local files if the desired time has passed", and the module would just cache them again.
Do you see this working?
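The "delete the local files if the desired time has passed" invalidation can be sketched with the stdlib alone (`purge_expired` is a hypothetical helper; whichever caching module is chosen would re-fetch after the purge):

```python
import time
from pathlib import Path


def purge_expired(cache_dir: Path, max_age_seconds: float) -> None:
    """Delete cached files older than max_age_seconds; they will be re-fetched."""
    now = time.time()
    for cached in cache_dir.iterdir():
        # Use the file's modification time as its "cached at" timestamp.
        if cached.is_file() and now - cached.stat().st_mtime > max_age_seconds:
            cached.unlink()
```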
You sure gave a lot more thought to caching than I did.
> Instead of an integer, we could accept English strings and infer the `datetime.timedelta` from them

That is so much more usable than my original proposal. Limiting it to seconds / minutes / hours / days sounds good from both sides: complete enough for users (at least us) and "just" a regex away for the implementation.
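That "regex away" could look like this (a sketch; `parse_duration` is a hypothetical helper, not part of Nitpick):

```python
import re
from datetime import timedelta


def parse_duration(text: str) -> timedelta:
    """Parse strings like '34 seconds', '1 hour' or '3 days' into a timedelta."""
    match = re.fullmatch(r"\s*(\d+)\s*(second|minute|hour|day)s?\s*", text)
    if not match:
        raise ValueError(f"Invalid duration: {text!r}")
    amount, unit = int(match.group(1)), match.group(2)
    # timedelta keyword arguments are always plural, so add the "s" back.
    return timedelta(**{unit + "s": amount})
```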
> I was thinking of a simple (maybe dumb) approach: just caching the HTTP requests using another module.
That would make things a lot simpler, if the headers are optimal (GitHub does not play nicely for instance).
Or do you mean this would be only for time-limited caching?
Hashing the URI makes sense with regard to caching immutable content, if you cannot / don't want to rely on HTTP headers. It is a dead-simple way to achieve it, no external library needed.
We also use vcrpy for testing, it's a great tool!
My original train of thought was to implement this in two steps: `never` (default) / `forever` with immutable URI hashing, and then duration-based (new default?) with request caching.
> Or do you mean this would be only for time-limited caching?
I meant using a module in both cache situations (`forever` and time-limited).
Currently I cache files manually.
My idea:
- save remote style files locally using one of VCRPy / requests-cache / cachecontrol, and remove the current manual caching;
- if cache = `never`, clean up the cache dir before anything;
- if cache = `forever` or time-limited, pass the expiration time to the caching module, and it should handle deleting the local files (or not).
It is a theoretical approach.
I didn't try any of the modules yet, to see if this idea actually works.
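The three branches above could be dispatched roughly like this (a sketch with hypothetical names, independent of whichever caching module ends up being chosen):

```python
import shutil
from datetime import timedelta
from pathlib import Path
from typing import Optional, Union


def configure_cache(cache_dir: Path, option: Union[str, timedelta]) -> Optional[timedelta]:
    """Hypothetical dispatch for a cache option: 'never', 'forever', or a timedelta."""
    if option == "never":
        # Clean up the cache dir before loading any style.
        shutil.rmtree(cache_dir, ignore_errors=True)
        cache_dir.mkdir(parents=True, exist_ok=True)
        return None
    if option == "forever":
        return timedelta.max  # effectively never expires
    return option  # time-limited: hand the timedelta to the caching module
```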
> Hashing the URI makes sense with regards to caching immutable content, if you cannot / don't want to rely on HTTP headers. It is a dead simple way to achieve it, no external library needed.
Currently I save remote files manually (`Path.write_text()`), and I use `slugify` to get a local file name (`slugify(new_url)`).

Correct me if I didn't understand: you are suggesting to still save the files the same way, but hashing the URI instead of `slugify`ing... right?
> Correct me if I didn't understand: you are suggesting to still save the files the same way, but hashing the URI instead of slugifying... right?
Yes, to avoid using a caching library altogether.
Again, it is a matter of scope: to support time-limited caching you want to use a caching library, which can also handle `forever` caching.
So my proposal is only valid iff you want to split the feature in two.
Keeping it as one feature invalidates my original proposal; yours sounds more logical 👍: delegate all caching logic (including storing the files) to the library.
> if cache = `never`, clean up the cache dir before anything.
I suppose "before anything" means before loading the style, wherever it comes from.
Listing below the possible caching packages that could solve this issue.
- I'm inclined towards sdispater/cachy because it can use files for caching and has a built-in datetime expiration.
- Another option would be the more popular joblib/joblib and its Memory class. But it has no expiration.
- By using HTTP caching packages like ionrock/cachecontrol or kevin1024/vcrpy, I would have to write the cache expiration logic as well (and VCRPy seems too heavy for the task).
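The file-based, time-limited behavior wanted from such a package can be sketched with the stdlib alone (this is not cachy's API; `remember` is a hypothetical helper in the spirit of cachy's get-or-compute pattern):

```python
import time
from pathlib import Path
from typing import Callable


def remember(cache_file: Path, ttl_seconds: float, fetch: Callable[[], str]) -> str:
    """Return cached text while it is fresh; otherwise fetch and re-cache it."""
    if cache_file.exists() and time.time() - cache_file.stat().st_mtime < ttl_seconds:
        return cache_file.read_text()
    text = fetch()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(text)
    return text
```

A second call within the TTL reads the file instead of hitting the network.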
🎉 This issue has been resolved in version 0.26.0 🎉
The release is available on:

- GitHub release: v0.26.0
Your semantic-release bot 📦🚀