Use an LRU cache for standalone funs #82
Conversation
Pull Request Test Coverage Report for Build 530

Warning: This coverage report may be inaccurate. This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

💛 - Coveralls
Khepri already offers a preference for Ra query types (local, leader). In many cases, getting stale data is acceptable; in others, a way to bypass the cache would be needed. Cache misses are OK as long as they are relatively rare (assuming there are significantly more reads than writes).
I should clarify that this is the cache interface for transaction functions (#75) rather than the query cache (#33). For this cache, a key living too long means extra memory overhead, and a key being evicted too soon means the fun has to be re-compiled; it doesn't affect data staleness. This implementation scales well with capacity, so we should be able to tune the trade-off between memory consumption and cache misses without too much trouble.
I wonder if the cache should allow passing in an eviction callback that gets called when a key is dropped from the cache. Even when a key is dropped from the cache, the standalone function's module isn't purged, so memory could grow continuously unless we purge evicted modules.
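For illustration, a rough sketch of what such a hook could look like. Neither `lru_cache:new/2` nor the `on_evict` option exist in this PR; the names (and the `unload_sketch` helper shown further down) are hypothetical:

```erlang
%% Hypothetical sketch only: lru_cache:new/2, the on_evict option and
%% unload_sketch are illustrative names, not part of this PR or Khepri.
-module(fun_cache_sketch).
-export([new/1]).

new(Capacity) ->
    %% The callback receives the evicted key and value. For standalone
    %% funs the value would carry the name of the module generated for
    %% the fun, so the callback can unload it and reclaim the code memory
    %% (see the unload sketch below).
    OnEvict = fun(_Key, Module) -> unload_sketch:unload(Module) end,
    lru_cache:new(Capacity, #{on_evict => OnEvict}).
```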
Also, I updated the micro-benchmark (https://github.com/the-mikedavis/lru_bench/tree/0263ed83bc35b58d2e50a95302c849d666492238) to run in parallel, and this implementation seems to scale even better under concurrent access. It's still ~1.5x slower than `persistent_term`.
Ah, I see. Then perfect read consistency is not a goal for this cache since it does not store user data. I trust your judgement @the-mikedavis 👍
Closing for now - we can revisit this later if the memory consumption of standalone funs becomes more pressing. We might end up re-using or modifying the LRU cache from this branch in the query cache. That needs a bit of work: the cache should take a callback that it executes when evicting an entry so we can unload the standalone fun's module and purge the memory taken up by its code as well.
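As a sketch of the eviction action itself, the unloading step can be done with standard OTP calls; `unload_sketch:unload/1` is a hypothetical helper, not Khepri or PR API, and it assumes the module has no lingering old code:

```erlang
%% Hypothetical helper showing how an eviction callback could unload the
%% module generated for a standalone fun.
-module(unload_sketch).
-export([unload/1]).

unload(Module) ->
    %% code:delete/1 marks the currently loaded code for Module as "old".
    _ = code:delete(Module),
    %% code:soft_purge/1 removes the old code only if no process is still
    %% executing it, so an in-flight transaction keeps running; a later
    %% eviction pass or periodic sweep could retry.
    code:soft_purge(Module).
```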
I tried out micro-benchmarking some LRU styles: the-mikedavis/lru_bench, and I found one that I think works pretty well, especially for concurrent read/write throughput. I need to plug this into khepri-benchmark to see how it fares with concurrency, but the micro-benchmarks say that this cache is ~1.4x slower for `get/2`s and ~1.8x slower for `put/2`s compared to `persistent_term`, which I think is not too bad.

Mostly I had fun beating `lru`, whose runtime is linear in the cache capacity for both `put/2` and `get/2` (this implementation is logarithmic). Look how it crawls in that XX-Large case! 😄

I'd like to improve the documentation and maybe rename some variables, but I thought I'd open this up to discussion earlier rather than later. Plus I have some questions:
One is about `rabbit_auth_cache` and `ets`. And mostly: is the lack of atomicity in `put/2` and `get/2` ok? A TLA+ checker might point out some nasty scenarios from concurrent `put/2`s and `get/2`s, but given that this cache is purely for speed, I think the fidelity of the data doesn't matter much: if keys are dropped or something lives in the cache for longer than expected, it's no big deal. What do you think?

connects #75
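For a sense of how an LRU can keep both operations logarithmic in the capacity, here is a minimal, single-process sketch that pairs a map with a `gb_trees` ordered by a monotonically increasing use counter. It is illustrative only, not necessarily the structure used in this branch, and it threads the cache state explicitly, which is why the arities differ from the `put/2`/`get/2` discussed above:

```erlang
%% Illustrative LRU sketch: O(log N) get and put in the cache capacity.
%% Not concurrency-safe on its own and not the code from this PR.
-module(lru_sketch).
-export([new/1, get/2, put/3]).

-record(lru, {capacity, next = 0, keys, order}).

new(Capacity) ->
    %% keys  : Key => {UseStamp, Value}
    %% order : UseStamp => Key, ordered oldest-first
    #lru{capacity = Capacity, keys = #{}, order = gb_trees:empty()}.

get(Key, #lru{keys = Keys, order = Order, next = Next} = Lru) ->
    case Keys of
        #{Key := {Stamp, Value}} ->
            %% Bump the key to the most recent position: O(log N).
            Order1 = gb_trees:insert(Next, Key, gb_trees:delete(Stamp, Order)),
            {ok, Value, Lru#lru{keys = Keys#{Key := {Next, Value}},
                                order = Order1,
                                next = Next + 1}};
        _ ->
            miss
    end.

put(Key, Value, #lru{capacity = Cap, keys = Keys, order = Order, next = Next} = Lru) ->
    %% If the key is already present, drop its old position first.
    {Keys0, Order0} =
        case Keys of
            #{Key := {OldStamp, _}} ->
                {maps:remove(Key, Keys), gb_trees:delete(OldStamp, Order)};
            _ ->
                {Keys, Order}
        end,
    %% At capacity: evict the least recently used key, O(log N).
    {Keys1, Order1} =
        case map_size(Keys0) >= Cap of
            true ->
                {_OldestStamp, OldKey, Order2} = gb_trees:take_smallest(Order0),
                {maps:remove(OldKey, Keys0), Order2};
            false ->
                {Keys0, Order0}
        end,
    Lru#lru{keys = Keys1#{Key => {Next, Value}},
            order = gb_trees:insert(Next, Key, Order1),
            next = Next + 1}.
```

Note that even a read mutates the structure (the key's position is bumped), which is where the atomicity question above comes from once reads and writes run concurrently.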