zio / zio-cache
A ZIO native cache with a simple and compositional interface
Home Page: https://zio.dev/zio-cache
License: Apache License 2.0
For some business needs, we might be interested in knowing when an entry expires and gets evicted from the cache. We should provide a mechanism (such as callbacks) that users can tap into for such events.
If cache reset is added, perhaps we should provide a hook for that event too.
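A sketch of the shape such a hook could take. All names here (`EvictionReason`, the `onEviction` parameter) are hypothetical, not part of the current zio-cache API:

```scala
// Hypothetical: why an entry left the cache, passed to a user callback.
sealed trait EvictionReason
object EvictionReason {
  case object Expired     extends EvictionReason // time-to-live elapsed
  case object Capacity    extends EvictionReason // evicted to make room
  case object Invalidated extends EvictionReason // explicit invalidation / cache reset
}

// A hypothetical extra constructor parameter the cache would invoke
// after an entry has been removed:
//   onEviction: (Key, EvictionReason) => UIO[Unit]
```

Modeling the reason as a sealed trait keeps the callback open to a future "reset" case without breaking existing listeners.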
Use case: a cache of document collections. Each entry can have a small or large number of documents, which can themselves vary in size. Eviction should happen because the memory heap is too full; so if there is 500 MB 'free', I can either allow insertion of many small document collections or perhaps just one big collection.
Currently the 'size' of the cache is just the number of entries (weight fixed to 1); I would like to allow the lookup function to provide arbitrary weights (float/double).
Currently we cache errors when lookup function calls fail. While this can be beneficial (e.g. as a countermeasure to malicious attacks that send bogus requests for non-existing entities), there are times when such behavior is not desired. We should give users the option to opt out.
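If I recall the API correctly, `Cache.makeWith` already derives the time-to-live from the lookup's `Exit`, so opting out of error caching can be approximated by expiring failures immediately. A sketch; `fetchUser` stands in for a real lookup:

```scala
import zio._
import zio.cache.{Cache, Lookup}

// Stand-in for a real lookup that may fail.
def fetchUser(id: Int): IO[String, String] = ???

// Successes are cached for an hour; failures expire immediately,
// so the next `get` for that key retries the lookup.
val cache = Cache.makeWith(capacity = 1000, lookup = Lookup(fetchUser)) {
  case Exit.Success(_) => 1.hour
  case Exit.Failure(_) => Duration.Zero
}
```

A first-class `cacheErrors` flag would still be friendlier than encoding the policy in the TTL function.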
Sometimes it's desired to pre-warm the cache with entries that are known to be frequently used, as an optimization strategy. For instance, an e-commerce web site might want to load the top 100 most popular items right from the start.
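With the existing API, pre-warming can be sketched by eagerly calling `get` for the known-hot keys at startup (`itemCache` and `topItemIds` are made up for illustration):

```scala
import zio._
import zio.cache.Cache

// Eagerly populate the cache: each `get` on a miss runs the lookup
// and stores the result; hits are cheap no-ops.
def preWarm[Key, Error, Value](
  cache: Cache[Key, Error, Value],
  keys: Iterable[Key]
): IO[Error, Unit] =
  ZIO.foreachParDiscard(keys)(cache.get(_))

// e.g. preWarm(itemCache, topItemIds) for the top 100 most popular items
```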
It would be nice to be able to cache `ZManaged` values that get released when they are evicted from the cache.
I need to set expiry time for each cached item from the lookup function.
In my use case I request an auth token from a server and the response contains the token as well as its expiry time.
Because of this I need some way to set the expiry time for each item when adding it to the cache.
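Assuming `Cache.makeWith` (whose TTL function has the shape `Exit[Error, Value] => Duration`), a per-entry expiry derived from the response could be sketched like this; `AuthToken` and `requestToken` are made up for illustration:

```scala
import java.time.Instant
import zio._
import zio.cache.{Cache, Lookup}

final case class AuthToken(token: String, expiresAt: Instant)

// Stand-in for the real call that requests a token from the auth server.
def requestToken(clientId: String): IO[Throwable, AuthToken] = ???

// Derive each entry's time-to-live from the expiry carried in the response.
// (Sketch: reading the clock inside the TTL function is impure, but matches
// the function's non-effectful signature; zio.Duration is an alias for
// java.time.Duration in ZIO 2.)
val tokenCache =
  Cache.makeWith(capacity = 100, lookup = Lookup(requestToken)) {
    case Exit.Success(t) => java.time.Duration.between(Instant.now(), t.expiresAt)
    case Exit.Failure(_) => Duration.Zero // don't cache failed token requests
  }
```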
This will help with ZIO metrics integration for caches.
Currently we provide `hits` and `misses` stats. Cache count should be a useful addition to the stats; a `LongAdder` for the count should suffice.
We could also report `capacity` in the stats even though the value is user-provided. Since `zio.internal.MutableConcurrentQueue` backs the cache, perhaps the current allocation size of the underlying data structure (if it varies/grows) is also a good stat to report (related: `CachePolicy`, `EntryStats`, `Cache#make`).
Separately, move "ttl" concerns into an `expirationTime` member of `EntryStats` (an alternative idea to #6).
Maybe we create an `Estimator[Value]` that can estimate the size of a value, which can be passed into `Cache.make`.
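A minimal sketch of what such an `Estimator` could look like: a pure function from a value to its estimated weight. The names are hypothetical, not part of zio-cache:

```scala
// Hypothetical: estimates the weight a value contributes toward capacity.
trait Estimator[-Value] {
  def estimate(value: Value): Double
}

object Estimator {
  // Every entry weighs 1, reproducing today's entry-count semantics.
  def uniform[Value]: Estimator[Value] = _ => 1.0

  // Build an estimator from any size function.
  def fromSize[Value](size: Value => Int): Estimator[Value] =
    value => size(value).toDouble
}

// Example: weigh a cached document collection by its total character count.
val byChars: Estimator[List[String]] = Estimator.fromSize(_.map(_.length).sum)
```

Keeping the default as `uniform` would make weighted capacity a backward-compatible extension.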
Currently the test cases are pretty simple. More complex scenarios should be covered. Additionally, tests for other platforms (namely JS and Native) should also be added to make sure behavior on those platforms matches our expectations.
It would be nice to have a jitter parameter so if a number of keys is getting queried continuously, the periodic re-fetching of them spreads out a bit.
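A sketch of TTL jitter, assuming the cache exposed a way to post-process the base time-to-live (the `jitteredTtl` helper is made up): spread the TTL by a random factor so keys fetched together do not all expire, and re-fetch, together.

```scala
import scala.util.Random

// A `jitter` of 0.2 means the result lies within +/-20% of `base`.
def jitteredTtl(base: java.time.Duration, jitter: Double, rnd: Random): java.time.Duration = {
  val factor = 1.0 + (rnd.nextDouble() * 2 - 1) * jitter
  java.time.Duration.ofNanos((base.toNanos * factor).toLong)
}
```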
Depending on the underlying data structure, initializing it to a known size that fits the user's usage pattern might be a good optimization.
If I'm not mistaken, `zio.internal.MutableConcurrentQueue` is currently used as the underlying data structure. I'm not familiar with the characteristics or the implementation of this data structure, so perhaps this ticket won't be applicable or useful to `MutableConcurrentQueue`.
When tests are executed using IntelliJ, tests fail with the following output (varies with each run).
CacheSpec
cacheStats
Test failed after 4 iterations with input: 13
Original input before shrinking was: 296005039
• 47 was not equal to 49
hits == 49L
hits = 47
at /home/ravi/projects/zio-cache/zio-cache/shared/src/test/scala/zio/cache/CacheSpec.scala:22
Test failed after 4 iterations with input: 13
Original input before shrinking was: 296005039
• 53 was not equal to 51
misses == 51L
misses = 53
at /home/ravi/projects/zio-cache/zio-cache/shared/src/test/scala/zio/cache/CacheSpec.scala:23
`refresh` method
`get` or `refresh` failed
Process finished with exit code 1
`includeKeys` - A predicate on keys that should ALWAYS be cached
`excludeKeys` - A predicate on keys that should NEVER be cached
`orElse` (fallback)
`race` (first success)
`onSuccess(v => ZIO(...))`
`onFailure(e => ZIO(...))`
val lookup2 = lookup.includeKeys(List("SPECIAL_KEY1", "SPECIAL_KEY2") contains _).excludeKeys(_ == "SPECIAL_KEY3")
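One way the key predicates might be declared: attach a caching predicate to the lookup that the cache would consult before storing a value. Everything here is a hypothetical sketch, not the current `Lookup` API:

```scala
import zio._

// Hypothetical: a Lookup that carries a predicate deciding whether a
// freshly looked-up value should be stored in the cache at all.
final case class CachingLookup[-Key, -Environment, +Error, +Value](
  lookup: Key => ZIO[Environment, Error, Value],
  cacheable: Key => Boolean = (_: Key) => true
) {
  // Keys matching `p` are always cached, in addition to the current policy.
  def includeKeys(p: Key => Boolean): CachingLookup[Key, Environment, Error, Value] =
    copy(cacheable = k => p(k) || cacheable(k))

  // Keys matching `p` are never cached, overriding the current policy.
  def excludeKeys(p: Key => Boolean): CachingLookup[Key, Environment, Error, Value] =
    copy(cacheable = k => !p(k) && cacheable(k))
}
```

`orElse`, `race`, `onSuccess`, and `onFailure` would compose the underlying effects directly and need no cache cooperation, unlike the two predicates above.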
In some scenarios, we might want to add a value to the cache directly without calling the lookup function. An example of such business logic is we have some special values that don't exist in the database (from which the lookup function retrieves values), nonetheless we want to serve these special values by injecting them directly into the cache.
The decision to bypass lookup is most likely determined by some external conditions. Currently the `Lookup` function has the following signature:
def lookup(key: Key): ZIO[Environment, Error, Value]
And this function is provided upfront during the construction of the cache, when the conditions to bypass might not be available.
We'd probably either need to express the conditions and the value to add through `Environment`, or need to add additional (optional) parameters to the `lookup` function.
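A third direction would be a direct-insertion operation on the cache itself. The `put` method below is hypothetical, not part of the current interface:

```scala
import zio._

// Hypothetical extension of the Cache interface.
trait CacheWithPut[Key, +Error, Value] {
  def get(key: Key): IO[Error, Value]

  // Insert `value` under `key` directly, without invoking the lookup;
  // useful for special values that don't exist in the backing store.
  def put(key: Key, value: Value): UIO[Unit]
}
```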
Hi John,
Why is your coding style so complex? Could you please change it to something more meaningful and less complex?
Thanks
In-memory caching is very desirable, but any caching mechanism should support multiple backends as well.
Although it's not common, some special circumstances might require clearing the entire cache. We should allow it if desired.
We might go one step further and allow a user-provided predicate to clear entries conditionally.
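A sketch of the interface this could take; the predicate-based `invalidateWhere` is hypothetical:

```scala
import zio._

trait CacheInvalidation[Key] {
  // Clear the entire cache.
  def invalidateAll: UIO[Unit]

  // Hypothetical: clear only the entries whose keys match the predicate.
  def invalidateWhere(p: Key => Boolean): UIO[Unit]
}
```

The full clear then becomes `invalidateWhere(_ => true)`, so only the predicate form strictly needs to exist.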
Because we do not "reload" on eviction, this would always be 1. We should probably delete it for now.
A goal should be that if all entries should expire after 1 hour, then if the cache is left alone for a sufficient amount of time, eventually, it contains no entries.
Much like there is a constructor to transform the key, it would be great to have one to transform the value. The use case I have right now is an HTTP call where the TTL is controlled by the `Cache-Control` header in the response, but the value stored in the cache is the decoded body. Does that make sense?
def makeUltimate[In, Key, Environment, Error, Result, Value](
  capacity: Int,
  lookup: Lookup[In, Environment, Error, Result]
)(
  timeToLive: Exit[Error, Value] => Duration,
  keyBy: In => Key,
  valueFrom: Result => ZIO[Environment, Error, Value]
)(implicit trace: Trace): URIO[Environment, Cache[In, Error, Value]]
Home stretch: add a conditional lookup function, so that in my use case, instead of doing a blanket `GET` request again, it would add the `If-Modified-Since` header.
Many HTTP APIs have a batch endpoint. This allows multiple values to be requested with a single HTTP call.
This doesn't work well with ZIO-cache right now, as there is no way to look up multiple values at once.
HTTP endpoints are prime targets for caching, since network overhead is usually significant. So I think support for this use case would be a great addition to ZIO-cache.
I'm not sure what the best interface for this would be, but ideally, it would:
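One possible shape for a batch-aware interface; all names here are hypothetical, not part of zio-cache:

```scala
import zio._

// Hypothetical batch lookup: receives all missing keys at once,
// so a single call to the batch endpoint can serve them.
final case class BatchLookup[Key, Environment, Error, Value](
  lookupAll: Set[Key] => ZIO[Environment, Error, Map[Key, Value]]
)

trait BatchCache[Key, Error, Value] {
  // Serves hits from the cache and fetches all misses with one lookupAll call.
  def getAll(keys: Set[Key]): IO[Error, Map[Key, Value]]
}
```

The interesting design questions are how partial failures are reported (fail the whole batch, or return the successes?) and how concurrent `getAll` calls with overlapping misses are deduplicated.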
Is this intentional? It kind of confused me as a new user when I wanted to test a behavior.
It should be possible to audit a cache to figure out why values are retained or expired (and when, etc.).
When upgrading an app that uses zio-cache to ZIO 2.1.0-RC2, I get:
Exception in thread "zio-fiber-131478979" java.lang.NoSuchMethodError: 'zio.internal.MutableConcurrentQueue zio.internal.MutableConcurrentQueue$.unbounded()'
at zio.cache.Cache$CacheState$.initial(Cache.scala:369)
Would it be possible to have an RC release or something that supports ZIO 2.1? Thanks!
Currently `get` is the only way to trigger a lookup call, which may or may not happen depending on whether the target entry resides in the cache. However, there are times when we want to:
At the moment, we would have to `invalidate` the entry first and then `get` it again. This is probably not the best way to handle it. For example, suppose a popular item is being requested constantly. If we evict it first and then fetch it, we could receive tons of requests for this item during the fetch. Even though we can handle a Thundering Herd situation, we should avoid it in the first place.
The proposal is that we can trigger an update, which runs in the background. Upon a successful retrieval, we will update the entry with the new value. During the time of retrieval, all incoming requests are served right away without delay.
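Assuming a `refresh` method with this semantics exists on `Cache` (it recomputes the value while readers keep being served the old one), triggering it in the background might look like this sketch:

```scala
import zio._
import zio.cache.Cache

// Recompute the entry on a background fiber; concurrent `get` calls keep
// returning the old value until the new one is ready. Failures are ignored
// here; a real implementation would likely log them.
def refreshInBackground[K, E, V](cache: Cache[K, E, V], key: K): UIO[Unit] =
  cache.refresh(key).ignore.forkDaemon.unit
```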
Currently there is no easy way to iterate through or query against the items in the cache. There might be cases where you would like to do that (debugging comes to mind).
Maybe we can provide either an iterator or a query/filter interface to users for this purpose. However, we'll need to take potential performance impact and data consistency into account:
It doesn't seem to provide much right now and feels more like boilerplate rather than a useful abstraction. Is there a plan to add other types of lookups?
Hi!
While working with zio-cache I ran into a problem where `Key` alone does not provide enough information to compute the cached `Value`.
As a motivating example, let's assume that some request contains the user ID and some other data extracted from a session cookie.
case class CookieData(data: String)
case class Request(userId: Int, cookieData: CookieData)

trait UserSessionDataComputationService {
  type UserSessionData
  def expensiveUserSessionDataComputation(request: Request): UIO[UserSessionData]
}
A cache lookup can trigger an `expensiveUserSessionDataComputation` call to compute the cached value.
With the current version of zio-cache, we can set `Key` to `Int`, `Value` to `UserSessionData`, and `Environment` to `UserSessionDataComputationService`.
To run the effect and construct the cache we must provide a `UserSessionDataComputationService` once. However, we cannot access the `Request` instance.
We cannot solve the problem by making `Request` part of the environment, because the request would only be set once during cache creation instead of at cache lookup.
One solution is to update the interface of Cache's `get`, `lookupValue` and `refresh` methods to return a `ZIO[Environment, Error, Value]` instead of an `IO[Error, Value]`. In the example given above, `Environment` is set to `Request` only, and `UserSessionDataComputationService` can be provided as input for a cache layer.
This branch shows a possible implementation with a demo app:
https://github.com/landlockedsurfer/zio-cache/commits/lookup-environment
Another solution is to project out the key from a given input. A new type variable `Input` is introduced, and a `keyByInput` function is passed on cache creation to extract the key from the given input.
This branch shows a possible implementation for this solution including a demo app:
https://github.com/landlockedsurfer/zio-cache/commits/key-by-input
What do you think?
Kind regards,
Manfred
Is it possible to tweak the signature of `get` to know whether the result I get has been calculated by my query or by another request that happened to arrive before mine? We need it to refine our retry policy client-side.