
Comments (7)

andreoliwa commented on June 13, 2024

So my proposal is only valid iff you want to split the feature in two.

I would just do it in one go, a single caching feature.
I'm interested in the duration-based caching, and you seem interested in the forever caching... so the feature will be available for both of us at once. 🙂

I suppose "before anything" means before loading the style, wherever it comes from.

Yes.

from nitpick.

andreoliwa commented on June 13, 2024

Hello @bibz, and thanks a lot for such detailed specs and comments.

  • an integer n > 0: the cache expires after n seconds (or minutes, or hours, or days).

Instead of an integer, we could accept English strings and infer the datetime.timedelta from them:

  • 34 seconds = timedelta(seconds=34)
  • 20 minutes = timedelta(minutes=20)
  • 1 hour = timedelta(hours=1) (append the "s" when the unit is written in the singular, because the timedelta keywords are plural)
  • 3 days = timedelta(days=3)
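
The mapping above could be sketched roughly as follows. This is only an illustration: the helper name `parse_duration`, the regex, and the decision to reject zero are assumptions, not code from nitpick.

```python
import re
from datetime import timedelta

# Hypothetical grammar: "<positive integer> <unit>", where the unit is
# second/minute/hour/day with an optional trailing "s".
_DURATION_RE = re.compile(
    r"^\s*(\d+)\s*(second|minute|hour|day)s?\s*$", re.IGNORECASE
)


def parse_duration(text: str) -> timedelta:
    """Turn a string such as '20 minutes' into a timedelta."""
    match = _DURATION_RE.match(text)
    if not match:
        raise ValueError(f"Unrecognised duration: {text!r}")
    amount = int(match.group(1))
    if amount <= 0:
        # The spec quoted above asks for an integer n > 0.
        raise ValueError(f"Duration must be positive: {text!r}")
    unit = match.group(2).lower()
    # timedelta keywords are plural, so always append the "s".
    return timedelta(**{unit + "s": amount})
```

With this sketch, `parse_duration("1 hour")` and `parse_duration("1 hours")` both yield `timedelta(hours=1)`, and anything outside the four units raises `ValueError`.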

The cached styles would live in the cache directory, keyed by the hash of their URI.
Individual styles would not be cached, only the resulting generated style (which currently lives in the cache directory, by the way).
Keying by URI makes it painless to handle immutable styles.

I was thinking of a simple (maybe dumb) approach: just caching the HTTP requests using another module.
I found some alternatives:

I still don't know which one to use.
VCRPy works nicely, but it might be large, and I only need to capture requests calls.
A smaller module might be better.

With any of these modules, I think we would need neither to hash URIs nor to care about the origin server.
The cache invalidation would be something like "delete the local files if the desired time has passed", and the module would just cache them again.
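
For illustration, the "delete the local files if the desired time has passed" rule could look something like this when done by hand with the standard library. The function names and the use of the file modification time as the cache timestamp are assumptions; a caching module would normally do this internally.

```python
import time
from datetime import timedelta
from pathlib import Path
from typing import Optional


def is_expired(path: Path, max_age: timedelta, now: Optional[float] = None) -> bool:
    """Return True if the file is missing or older than max_age."""
    if not path.exists():
        return True
    current = time.time() if now is None else now
    # Assumption: the modification time marks when the entry was cached.
    return current - path.stat().st_mtime > max_age.total_seconds()


def purge_expired(cache_dir: Path, max_age: timedelta) -> int:
    """Delete expired files in cache_dir and return how many were removed."""
    removed = 0
    for entry in cache_dir.iterdir():
        if entry.is_file() and is_expired(entry, max_age):
            entry.unlink()
            removed += 1
    return removed
```

After a purge, the next request for a removed file would simply miss the cache and fetch it again.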

Do you see this working?
You sure gave caching a lot more thought than I did.


bibz commented on June 13, 2024

Instead of an integer, we could accept English strings and infer the datetime.timedelta from them

That is so much more usable than my original proposal. Limiting it to seconds / minutes / hours / days sounds good from both sides: complete enough for users (at least us) and "just" a regex away for the implementation.

I was thinking of a simple (maybe dumb) approach: just caching the HTTP requests using another module.

That would make things a lot simpler, if the headers are optimal (GitHub does not play nicely for instance).
Or do you mean this would be only for time-limited caching?

Hashing the URI makes sense with regard to caching immutable content, if you cannot / don't want to rely on HTTP headers. It is a dead simple way to achieve it, no external library needed.
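
A minimal sketch of what I mean, using only the standard library. The function name, the choice of SHA-256, and the `.toml` suffix are assumptions for the example, not an existing nitpick API.

```python
import hashlib
from pathlib import Path


def cache_path_for(uri: str, cache_dir: Path, suffix: str = ".toml") -> Path:
    """Map a style URI to a stable file name inside cache_dir."""
    # The same URI always produces the same digest, so an immutable URI
    # (for example one pinned to a commit) can be looked up directly.
    digest = hashlib.sha256(uri.encode("utf-8")).hexdigest()
    return cache_dir / f"{digest}{suffix}"
```

If the file at that path exists, the style is served from disk; otherwise it is fetched once and written there.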

We also use vcrpy for testing, it's a great tool!


My original train of thought was to implement this in two steps: never (default) / forever with immutable URI hashing, and then duration-based (new default?) with request caching.


andreoliwa commented on June 13, 2024

Or do you mean this would be only for time-limited caching?

I meant using a module in both cache situations (forever and time-limited).
Currently I cache files manually.
My idea:

  • save remote style files locally using one of VCRPy/requests-cache/cachecontrol, and remove the current manual caching.
  • if cache = never, clean up the cache dir before anything.
  • if cache = forever or time-limited, pass the expiration time to the caching module, and it should handle deleting the local files or not.
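
Roughly, the three bullets above could translate into something like the sketch below. It is untested against any of the candidate modules; the function name and the convention of returning `None` for "no expiration" and a zero `timedelta` for "never cache" are assumptions made for the example.

```python
import shutil
from datetime import timedelta
from pathlib import Path
from typing import Optional, Union


def resolve_cache_option(
    cache_dir: Path, cache: Union[str, timedelta]
) -> Optional[timedelta]:
    """Return the expiration to hand to the caching module.

    None means "keep forever"; timedelta(0) means "expire immediately".
    """
    if cache == "never":
        # Clean up the cache dir before anything else is loaded.
        shutil.rmtree(cache_dir, ignore_errors=True)
        cache_dir.mkdir(parents=True, exist_ok=True)
        return timedelta(0)
    if cache == "forever":
        return None
    if isinstance(cache, timedelta):
        # Time-limited caching: the module decides when to delete files.
        return cache
    raise ValueError(f"Unknown cache option: {cache!r}")
```

The returned value would then be passed to whichever caching module ends up being chosen.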

It is a theoretical approach.
I didn't try any of the modules yet, to see if this idea actually works.

Hashing the URI makes sense with regard to caching immutable content, if you cannot / don't want to rely on HTTP headers. It is a dead simple way to achieve it, no external library needed.

Currently I save remote files manually (Path.write_text()), and I use slugify to get a local file name (slugify(new_url)).
Correct me if I misunderstood: you are suggesting that I still save the files the same way, but hash the URI instead of slugifying it... right?


bibz commented on June 13, 2024

Correct me if I misunderstood: you are suggesting that I still save the files the same way, but hash the URI instead of slugifying it... right?

Yes, to avoid using a caching library altogether.

Again, it is a matter of scope: to support time-limited caching you want to use a caching library, which can also handle forever caching.

So my proposal is only valid iff you want to split the feature in two.
Keeping it as one feature invalidates my original proposal; yours sounds more logical 👍 : delegate all caching logic (including storing the files) to the library.

if cache = never, clean up the cache dir before anything.

I suppose "before anything" means before loading the style, wherever it comes from.


andreoliwa commented on June 13, 2024

Below are the caching packages that could be used to solve this issue.

  1. I'm inclined towards sdispater/cachy because it can use files for caching and has a built-in datetime expiration.
  2. Another option would be the more popular joblib/joblib and its Memory class. But it has no expiration.
  3. By using HTTP caching packages like ionrock/cachecontrol or kevin1024/vcrpy, I would have to write the cache expiration logic as well (and VCRPy seems too heavy for the task).


github-actions commented on June 13, 2024

🎉 This issue has been resolved in version 0.26.0 🎉

The release is available on:

Your semantic-release bot 📦🚀
