zio-cache's Issues

Ability to add callback/hook to expiry events

Motivation

For some business needs, we might want to know when an entry expires and gets evicted from the cache. We should provide a mechanism (such as callbacks) that users can hook into for such events.

Considerations

If cache reset is added, perhaps we should provide a hook for that event too.
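One possible shape for such a hook, sketched as a hypothetical overload of the cache constructor (neither `onExpiry` nor this constructor variant exists in zio-cache today):

```scala
import zio._
import zio.cache._

// Hypothetical constructor taking an expiry callback. The hook would fire
// when an entry is evicted because its TTL elapsed.
def makeWithExpiryHook[Key, Environment, Error, Value](
  capacity: Int,
  timeToLive: Duration,
  lookup: Lookup[Key, Environment, Error, Value]
)(
  onExpiry: (Key, Value) => UIO[Unit]
): URIO[Environment, Cache[Key, Error, Value]] = ???
```

If a cache-reset hook is added later, it could take the same callback-at-construction shape.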

Alternative cost/weight per entry

Use case: a cache of document collections. Each entry can hold a small or large number of documents, and the documents themselves can vary in size. Eviction should happen because the memory heap is too full; so if there are 500 MB 'free', I can either allow the insertion of many small document collections or perhaps just one big collection.

Currently the 'size' of the cache is just the number of entries (every entry has a fixed weight of 1). I would like the lookup function to be able to provide an arbitrary weight (float/double) per entry.
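As a plain-Scala illustration of the idea (all names invented), weighted accounting replaces entry counting with a running weight sum, evicting until the total fits the budget:

```scala
// Sketch: evict the oldest entries until the summed weight fits maxWeight.
// A Vector of (key, weight) pairs stands in for the cache's eviction order.
def evictToBudget(entries: Vector[(String, Double)], maxWeight: Double): Vector[(String, Double)] = {
  var remaining = entries
  // drop from the front (oldest first) while over budget
  while (remaining.map(_._2).sum > maxWeight) remaining = remaining.tail
  remaining
}

val entries = Vector("a" -> 100.0, "b" -> 250.0, "c" -> 120.0)
val kept    = evictToBudget(entries, maxWeight = 400.0)
// 100 + 250 + 120 = 470 > 400, so "a" is evicted; 250 + 120 = 370 fits
```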

Option to turn off caching errors

Motivation

Currently we cache errors when lookup function calls fail. While this can be beneficial (e.g. as a countermeasure to malicious attacks that send bogus requests for non-existing entities), there are times when such behavior is not desired. We should give users the option to opt out.

Ability to preload the cache at start time

Motivation

Sometimes it's desired to pre-warm the cache with entries that are known to be frequently used, as an optimization strategy. For instance, an e-commerce web site might want to load the top 100 most popular items right from the start.

Considerations

  1. We'll probably need a separate lookup function to retrieve these entries and populate the cache.
  2. It's probably a good idea to stagger the provided TTL for each entry so that we can avoid the situation where these hotspot entries all expire at the same time and subsequent requests can trigger an avalanche of retrieval.
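Point 2 could be as simple as offsetting each entry's TTL by its position in the preload list; a plain-Scala sketch (names are invented):

```scala
// Sketch: give the i-th preloaded entry a TTL of base + i * step, so the
// hotspot entries expire one after another instead of all at once.
def staggeredTtlsMillis(baseTtlMillis: Long, stepMillis: Long, count: Int): List[Long] =
  List.tabulate(count)(i => baseTtlMillis + i * stepMillis)

val ttls = staggeredTtlsMillis(baseTtlMillis = 3600000L, stepMillis = 60000L, count = 3)
// expirations land one minute apart: 3600000, 3660000, 3720000
```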

Ability to set expiry time from lookup function

I need to set the expiry time for each cached item from the lookup function.
In my use case I request an auth token from a server, and the response contains the token as well as its expiry time.
Because of this I need some way to set the expiry time for each item when it is added to the cache.
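If I'm reading the API right, zio-cache's `Cache.makeWith` accepts a `timeToLive: Exit[Error, Value] => Duration`, which should cover this case by deriving the TTL from the lookup result. A sketch (the `Token` type and `fetchToken` lookup are hypothetical; check the current signature):

```scala
import zio._
import zio.cache._

final case class Token(value: String, expiresInSeconds: Long)

// The TTL of each cached entry is taken from the token the server returned.
def tokenCache(fetchToken: Lookup[String, Any, Throwable, Token]) =
  Cache.makeWith(capacity = 100, lookup = fetchToken) {
    case Exit.Success(token) => Duration.fromSeconds(token.expiresInSeconds)
    case Exit.Failure(_)     => Duration.Zero // don't keep failed lookups around
  }
```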

Provide more stats in `CacheStats`

Motivation

Currently we provide hit and miss counts. The current number of entries in the cache would be a useful addition to the stats.

Considerations

  1. Adding another LongAdder for the count should suffice.
  2. Maybe for completeness, we can include capacity in the stats even though the value is user-provided.
  3. Not sure this is applicable to zio.internal.MutableConcurrentQueue - perhaps the current allocation size of the underlying data structure (if it varies/grows) is also a good stat to report.
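A sketch of the proposed shape (the real `CacheStats` currently carries hits and misses; `size` is the addition from point 1 and `capacity` the optional extra from point 2, so the field names here are illustrative):

```scala
// Hypothetical extended stats record.
final case class CacheStats(
  hits: Long,     // reads served from the cache
  misses: Long,   // reads that fell through to the lookup function
  size: Long,     // current number of entries (point 1)
  capacity: Long  // user-provided maximum, echoed for completeness (point 2)
)

val stats = CacheStats(hits = 47, misses = 53, size = 80, capacity = 100)
```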

Refactor Evict

  1. Pull out of CachePolicy
  2. Simplify so it cannot look at time or EntryStats
  3. Pass it to cache constructor (Cache#make)

Separately, move TTL concerns into an expirationTime member of EntryStats (alternative idea for #6).

Add memory estimation

Maybe we create an Estimator[Value] that can estimate the size of a value, which can be passed into Cache.make.
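A minimal sketch of what `Estimator` might look like, as a single-abstract-method trait plus an example instance (the name comes from the proposal above, not an existing API):

```scala
// Hypothetical estimator passed to Cache.make so the cache can track
// approximate memory usage per value.
trait Estimator[-Value] {
  def estimateBytes(value: Value): Long
}

// Rough JVM estimate: ~2 bytes per char, ignoring object headers.
val stringEstimator: Estimator[String] = (s: String) => 2L * s.length
```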

Optional jitter for the cache ttl

It would be nice to have a jitter parameter so that, if a number of keys is queried continuously, their periodic re-fetching spreads out a bit.
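A plain-Scala sketch of the idea: perturb the base TTL by a uniform random factor so keys fetched together don't all expire together (the parameter name is invented):

```scala
import scala.util.Random

// Returns a TTL in [base * (1 - jitter), base * (1 + jitter)).
def jitteredTtlMillis(baseTtlMillis: Long, jitter: Double, rnd: Random): Long = {
  val factor = 1.0 + (rnd.nextDouble() * 2 - 1) * jitter
  (baseTtlMillis * factor).toLong
}

val rnd  = new Random(0)
val ttls = List.fill(100)(jitteredTtlMillis(60000L, 0.1, rnd))
// every jittered TTL stays within ±10% of one minute
```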

Ability to set an initial size in addition to `capacity`?

Motivation

Depending on the underlying data structure, initializing it to a known size that fits the user's usage pattern might be a good optimization.

Considerations

If I'm not mistaken, zio.internal.MutableConcurrentQueue is currently used as the underlying data structure. I'm not familiar with the characteristics or the implementation of this data structure. Perhaps this ticket won't be applicable or useful to MutableConcurrentQueue.

Test fails when executed in IntelliJ

When the tests are run from IntelliJ, they fail with the following output (which varies between runs).

  • CacheSpec

    • cacheStats
      Test failed after 4 iterations with input: 13
      Original input before shrinking was: 296005039
      • 47 was not equal to 49
      hits == 49L
      hits = 47
      at /home/ravi/projects/zio-cache/zio-cache/shared/src/test/scala/zio/cache/CacheSpec.scala:22

      Test failed after 4 iterations with input: 13
      Original input before shrinking was: 296005039
      • 53 was not equal to 51
      misses == 51L
      misses = 53
      at /home/ravi/projects/zio-cache/zio-cache/shared/src/test/scala/zio/cache/CacheSpec.scala:23

    • invalidate
    • invalidateAll
    • lookup
      • sequential
      • concurrent
      • capacity
    • refresh method
      • should update the cache with a new value
      • should update the cache with a new value even if the last get or refresh failed
      • should get the value if the key doesn't exist in the cache
    • size
      Ran 10 tests in 4 s 263 ms: 9 succeeded, 0 ignored, 1 failed
Process finished with exit code 1

Add operators to Lookup

  • includeKeys - A predicate on keys that should ALWAYS be cached
  • excludeKeys - A predicate on keys that should NEVER be cached
  • Combining lookup functions?
    • orElse (fallback)
    • race (first success)
  • Unary operators
    • onSuccess(v => ZIO(...))
    • onFailure(e => ZIO(...))

val lookup2 = lookup.includeKeys(List("SPECIAL_KEY1", "SPECIAL_KEY2") contains _).excludeKeys(_ == "SPECIAL_KEY3")
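A self-contained sketch of how the key predicates could compose, using a stand-in wrapper rather than the real `Lookup` (all names are illustrative; applying the combinators in order means the later one wins for overlapping keys):

```scala
// Stand-in for a lookup with a caching policy attached; not zio-cache API.
final case class PolicyLookup[K, V](
  run: K => V,
  shouldCache: K => Boolean = (_: K) => true
) {
  // keys matching p are always cached, whatever the prior policy said
  def includeKeys(p: K => Boolean): PolicyLookup[K, V] =
    copy(shouldCache = k => p(k) || shouldCache(k))
  // keys matching p are never cached, overriding the prior policy
  def excludeKeys(p: K => Boolean): PolicyLookup[K, V] =
    copy(shouldCache = k => !p(k) && shouldCache(k))
}

val lookup2 = PolicyLookup[String, Int](_.length)
  .includeKeys(List("SPECIAL_KEY1", "SPECIAL_KEY2").contains)
  .excludeKeys(_ == "SPECIAL_KEY3")
```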

Ability to add cache entries directly, bypassing the `Lookup` function

Motivation

In some scenarios we might want to add a value to the cache directly, without calling the lookup function. For example, we may have special values that don't exist in the database (from which the lookup function retrieves values), yet we still want to serve them by injecting them directly into the cache.

Considerations

The decision to bypass lookup is most likely determined by some external conditions. Currently the Lookup function has the following signature:

def lookup(key: Key): ZIO[Environment, Error, Value]

And this function is provided upfront during the construction of the cache, when the conditions to bypass might not be available.

We'd probably need to either express the conditions and the value to add through Environment, or add additional (optional) parameters to the lookup function.
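A third option is sketched below: rather than threading conditions through Environment, add a hypothetical `put` next to `get` (nothing like this exists in the current API):

```scala
import zio._

// Hypothetical extension of the Cache interface; put bypasses Lookup
// entirely and installs the value under the cache's usual TTL handling.
trait CacheWithPut[Key, Error, Value] {
  def get(key: Key): IO[Error, Value]
  def put(key: Key, value: Value): UIO[Unit]
}
```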

Complex Coding ?

Hi John,

Why is your coding style so complex? Could you please change it to something more meaningful and less complex?

Thanks

Ability to clear the cache

Motivation

Although it's not common, some special circumstances might require clearing the entire cache. We should allow it when desired.

Considerations

We might go one step further and allow a user-provided predicate to clear entries conditionally.

Call CachingPolicy#evict more often

A goal should be that, if all entries expire after 1 hour and the cache is then left alone for a sufficient amount of time, it eventually contains no entries.

Ability to transform value

Much like there is a constructor to transform the key, it would be great to have one to transform the value. The use case I have right now is an http call where the ttl is controlled by the Cache-Control header in the response, but the value stored in the cache is the decoded body. Does that make sense?

def makeUltimate[In, Key, Environment, Error, Result, Value](
    capacity: Int,
    lookup: Lookup[In, Environment, Error, Result]
  )(
    timeToLive: Exit[Error, Value] => Duration,
    keyBy: In => Key,
    valueFrom: Result => ZIO[Environment, Error, Value]
  )(implicit trace: Trace): URIO[Environment, Cache[In, Error, Value]]

Home stretch: add a conditional lookup function, so that in my use case instead of doing a blanket GET request again it would add the If-Modified-Since header.

Allow batch lookups

Many HTTP APIs have a batch endpoint. This allows multiple values to be requested with a single HTTP call.

This doesn't work well with ZIO-cache right now, as there is no way to look up multiple values at once.

HTTP endpoints are prime targets for caching, since network overhead is usually significant. So I think support for this use case would be a great addition to ZIO-cache.

I'm not sure what the best interface for this would be, but ideally, it would:

  • Use cached values for keys already present in the cache;
  • Call the user-defined batch function with the remaining keys (if any);
  • Add new entries to the cache;
  • Enforce at the type level that the user-defined batch function returns a value for every key.
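One hedged sketch of such an interface (types and method names invented; the "value for every key" guarantee from the last bullet is only commented here, not enforced at the type level):

```scala
import zio._

// Hypothetical batch-aware cache. getAll serves cached keys from memory,
// calls batchLookup once for the missing keys, caches the results, and
// merges both maps. batchLookup is expected to return a value for every
// key it is given.
trait BatchCache[Key, Environment, Error, Value] {
  def batchLookup(keys: Set[Key]): ZIO[Environment, Error, Map[Key, Value]]
  def getAll(keys: Set[Key]): ZIO[Environment, Error, Map[Key, Value]]
}
```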

Add hooks for auditing

It should be possible to audit a cache to figure out why values are retained or expired (and when, etc.).

Compat with ZIO 2.1.0-RC2

When upgrading an app that uses zio-cache to ZIO 2.1.0-RC2, I get:

    Exception in thread "zio-fiber-131478979" java.lang.NoSuchMethodError: 'zio.internal.MutableConcurrentQueue zio.internal.MutableConcurrentQueue$.unbounded()'
    	at zio.cache.Cache$CacheState$.initial(Cache.scala:369)

Would it be possible to have an RC release or something that supports ZIO 2.1? Thanks!

Ability to trigger a lookup call deliberately

Motivation

Currently get is the only way to trigger a lookup call, which may or may not happen depending on whether the target entry resides in the cache. However, there are times when we want to:

  1. refresh a cache entry to its most up-to-date value from our persistence store (it could have changed since the last retrieval);
  2. simply extend the TTL of an entry by repopulating it;
  3. both 1 and 2.

At the moment, we would have to invalidate the entry first and then get it again. This is probably not the best way to handle it. For example, suppose a popular item is being requested constantly: if we evict it first and then fetch it, we could receive tons of requests for that item during the fetch. Even though we can handle a thundering-herd situation, we should avoid creating one in the first place.

The proposal is that we can trigger an update, which runs in the background. Upon a successful retrieval, we will update the entry with the new value. During the time of retrieval, all incoming requests are served right away without delay.
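For what it's worth, the test output earlier on this page mentions a refresh method, which looks close to this proposal: it recomputes the value while readers keep being served the previous one. A usage sketch (signatures may differ from the released API):

```scala
import zio._
import zio.cache._

// Sketch: kick off a refresh in the background; incoming requests keep
// getting the old value until the new one lands.
def refreshInBackground[K, E, V](cache: Cache[K, E, V], key: K): UIO[Unit] =
  cache.refresh(key).ignore.forkDaemon.unit
```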

Ability to iterate/query items in the cache

Motivation

Currently there is no easy way to iterate through or query against the items in the cache. There might be cases where you would like to do that (debugging comes to mind).

Considerations

Maybe we can provide either an iterator or a query/filter interface to users for this purpose. However, we'll need to take potential performance impact and data consistency into account:

  1. Any query only represents a snapshot of the cache at a particular moment.
  2. We probably shouldn't keep an additional copy of the data just to hold such a snapshot.

Delete Lookup?

It doesn't seem to provide much right now and feels more like boilerplate than a useful abstraction. Is there a plan to add other types of lookups?

Enhance cache lookup

Hi!

While working with zio-cache I ran into a problem where Key alone does not provide enough information
to compute the cached Value.

As a motivating example let's assume that some request contains the user ID and some other data extracted
from a session cookie.

case class CookieData(data: String)
case class Request(userId: Int, cookieData: CookieData)

trait UserSessionDataComputationService {
  type UserSessionData
  def expensiveUserSessionDataComputation(request: Request): UIO[UserSessionData]
}

A cache lookup can trigger an expensiveUserSessionDataComputation call to compute the cached value.

With the current version of zio-cache, we can set Key to Int,
Value to UserSessionData and
Environment to UserSessionDataComputationService.
To run the effect and construct the cache we must provide a UserSessionDataComputationService once.
However, we cannot access the Request instance.
We cannot solve the problem by making Request part of the environment,
because the request would only be set once during cache creation instead of cache lookup.

One solution is to update the interface of Cache's get, lookupValue and refresh methods to return
a ZIO[Environment, Error, Value] instead of an IO[Error, Value].
In the example given above Environment is set to Request only.
UserSessionDataComputationService can be provided as input for a cache layer.

This branch shows a possible implementation with a demo app:
https://github.com/landlockedsurfer/zio-cache/commits/lookup-environment

Another solution is to project the key out of a given input. A new type variable Input is introduced,
and a keyByInput function passed at cache creation extracts the key from the given input.

This branch shows a possible implementation for this solution including a demo app:
https://github.com/landlockedsurfer/zio-cache/commits/key-by-input
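The key-by-input approach could take roughly this constructor shape; a hedged sketch based on the branch above, not necessarily any released API:

```scala
import zio._
import zio.cache._

// Hypothetical: Input carries everything the lookup needs; keyBy projects
// out the part used for cache identity, so two Requests with the same
// userId share one entry.
def makeWithKey[Input, Key, Environment, Error, Value](
  capacity: Int,
  lookup: Lookup[Input, Environment, Error, Value]
)(
  timeToLive: Exit[Error, Value] => Duration,
  keyBy: Input => Key
): URIO[Environment, Cache[Input, Error, Value]] = ???
```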

What do you think?

Kind regards,
Manfred

Refined result from `get`

Is it possible to tweak the signature of get so I can tell whether the result was computed by my own request or by another request that happened to arrive before mine?
We need it to refine our retry policy client-side.
