Comments (6)

JWCook avatar JWCook commented on June 23, 2024 2

FYI, a better option for using requests-cache is using CachedSession directly, instead of patching with install_cache(). It makes it more explicit what you are and aren't caching, and doesn't affect downstream requests calls. It's mostly thread-safe, except for cache_disabled(), as you noted; instead, when you want to make a non-cached request, you can just use a regular requests.Session.

Also, the library is being maintained again (with a fairly large release last week), and issues and PRs are welcome.

from hydrotools.

jarq6c avatar jarq6c commented on June 23, 2024

With reference to CacheControl, the default cache is a dict. They also have a FileCache that will persist requests using pickle and a RedisCache. I'm wondering about performance using the FileCache and portability using the RedisCache.

aaraney avatar aaraney commented on June 23, 2024

With reference to CacheControl, the default cache is a dict. They also have a FileCache that will persist requests using pickle and a RedisCache. I'm wondering about performance using the FileCache and portability using the RedisCache.

Thanks for doing a little digging; I have not done that myself so far. The issues you mentioned are problematic, and it does not sound like these options would support our needs. Specifically, neither a FileCache nor an in-RAM solution like Redis supports our needs. Additionally, I am not overly keen on bringing in another dependency just to fix this one issue. Plus, there is the status of requests-cache:

Noticed last commit was 14-Aug-2019. Which is almost 1.5 years ago as of today.

#14 may make it necessary to develop our own cache sub-package. For reference, this is something @jarq6c and @aaraney have previously discussed, but never formally tabled. At face value, it seems that we should be able to learn from the implementations of both CacheControl and requests-cache and, in effect, merge them. I think such a feature would need to be a little more general, as we've discussed before, and permit caching things in general via HDF5 as well as SQLite.

With all of that being said, I don't want this issue to dissolve into another subject; I just want to speak generally and voice the potential feature addition to address this bug.

jarq6c avatar jarq6c commented on June 23, 2024

The disadvantages of a hypothetical evaluation_tools._cache package are that someone has to write and maintain it. We may also end up reinventing the wheel to some extent. On the other hand, perhaps there is a way to design it such that wheel reinventing is minimized and we're really just writing wrappers around existing libraries.

The potential advantages of such a package go beyond its primary use to cache NWIS and NWM data. We could unify caching and storing of data through a singular interface. This might allow data/computational scientists using evaluation_tools to speed up development of long workflows by storing interim data. This interface could be incorporated into an event_detection_service that caches events. If an object store (or cloud bucket) became available we could use this package to automatically push data from any other evaluation_tools subpackage.

I'm imagining a subpackage that takes evaluation_tools canonical pandas.DataFrames or geopandas.GeoDataFrames and automagically caches and/or stores the data in a variety of formats.

hellkite500 avatar hellkite500 commented on June 23, 2024

@jarq6c it's ironic you mention an event cache. I may or may not have had the need to implement:

class EventCache(object):
    """
    Event cache manager.

    FIXME: current multiprocessing works because of a guarantee of process order,
    but this may need to be better protected from reader/writer issues.
    """

I would encourage thinking about the needs of caching and what exactly is being cached. A dataframe cache with clear semantics isn't too far fetched.
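To make "clear semantics" concrete, here is a rough sketch of a small key-value cache built only on the standard library. ObjectCache, its method names, and the table layout are hypothetical and are not part of evaluation_tools or the EventCache above. Values are pickled, so a pandas.DataFrame could be stored the same way as the plain dict used here; pickle should only be used to load data you trust.

```python
import pickle
import sqlite3
from contextlib import closing


class ObjectCache:
    """Hypothetical key-value cache backed by a single SQLite file."""

    def __init__(self, path):
        self.path = path
        with closing(self._connect()) as conn, conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value BLOB)"
            )

    def _connect(self):
        # One short-lived connection per operation; SQLite's own file
        # locking serializes writers, including writers in other processes.
        return sqlite3.connect(self.path, timeout=30)

    def put(self, key, obj):
        """Store obj under key, replacing any earlier value."""
        blob = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
        with closing(self._connect()) as conn, conn:
            conn.execute(
                "INSERT OR REPLACE INTO cache (key, value) VALUES (?, ?)",
                (key, blob),
            )

    def get(self, key, default=None):
        """Return the cached object for key, or default on a miss."""
        with closing(self._connect()) as conn:
            row = conn.execute(
                "SELECT value FROM cache WHERE key = ?", (key,)
            ).fetchone()
        return default if row is None else pickle.loads(row[0])
```

The semantics here are deliberately narrow: put overwrites, get returns a default on a miss, and nothing expires. Leaning on SQLite's locking rather than on a guaranteed process order is one way to approach the reader/writer concern in the FIXME, though whether it fits the real workload is an open question.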

aaraney avatar aaraney commented on June 23, 2024

Because of #78, multiprocessing should become largely unnecessary for retrieving data. With that in mind, I am going to close this, and we can reopen it in the future to continue the discussion.
