Change PickleCache from simple LRU to a more modern policy (like windowed LFU/SLRU) #45

@jamadden

Over in RelStorage we've been having a discussion based on the observation that strict LRU isn't necessarily a good cache policy for varying workloads.

Basically, if most queries/transactions/requests use a certain set of objects, those objects shouldn't be evicted from the cache just because an outlier request comes in that scans a different BTree. LRU is prone to that problem. More adaptive approaches aren't.

@ben-manes pointed us to a policy (windowed LFU/SLRU) that can be near optimal for a variety of workloads.

I think that would be a good fit for the PickleCache.
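
To make the shape of the idea concrete, here is a minimal Python sketch of a windowed-LFU/SLRU cache. This is not the RelStorage code; the class name, segment ratios, and the exact-count `freq` dict standing in for the frequency sketch are illustrative assumptions.

```python
from collections import OrderedDict


class WindowedSLRUCache:
    """Sketch: a small LRU window feeding a segmented LRU main area."""

    def __init__(self, capacity, window_pct=0.01, protected_pct=0.8):
        self.window_size = max(1, int(capacity * window_pct))
        main_size = capacity - self.window_size
        self.protected_size = max(1, int(main_size * protected_pct))
        self.probation_size = max(1, main_size - self.protected_size)
        self.window = OrderedDict()      # recently admitted entries
        self.probation = OrderedDict()   # first level of the main area
        self.protected = OrderedDict()   # entries hit at least twice
        self.freq = {}                   # stand-in for a frequency sketch

    def get(self, key):
        self.freq[key] = self.freq.get(key, 0) + 1
        if key in self.window:
            self.window.move_to_end(key)
            return self.window[key]
        if key in self.protected:
            self.protected.move_to_end(key)
            return self.protected[key]
        if key in self.probation:
            # A second hit promotes the entry from probation to protected.
            value = self.probation.pop(key)
            self.protected[key] = value
            self._shrink_protected()
            return value
        return None

    def put(self, key, value):
        for segment in (self.window, self.probation, self.protected):
            if key in segment:
                segment[key] = value
                self.get(key)            # refresh recency and frequency
                return
        self.freq[key] = self.freq.get(key, 0) + 1
        self.window[key] = value
        if len(self.window) > self.window_size:
            candidate, cvalue = self.window.popitem(last=False)
            self._admit(candidate, cvalue)

    def _admit(self, candidate, value):
        if len(self.probation) < self.probation_size:
            self.probation[candidate] = value
            return
        victim = next(iter(self.probation))  # LRU end of probation
        # Admission filter: keep whichever key is estimated to be used more.
        if self.freq.get(candidate, 0) >= self.freq.get(victim, 0):
            del self.probation[victim]
            self.probation[candidate] = value
        # Otherwise the candidate is simply dropped.

    def _shrink_protected(self):
        while len(self.protected) > self.protected_size:
            key, value = self.protected.popitem(last=False)
            self._admit(key, value)      # demote back through the filter
```

The point of the split is that scan-heavy outlier requests only churn the small window: a key evicted from the window has to beat probation's own eviction candidate on estimated frequency before it can displace anything in the main area, and entries hit a second time move into the protected segment where one-off traffic can't reach them.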

I have a C and CFFI implementation that I'm using in RelStorage. I don't know whether we can share that code directly (it would be cool to split it off as its own library on PyPI and distribute binary wheels just for it, so I don't have to do that for RelStorage), but we could at least start from it.

Other notes:

  • The code as it stands doesn't implement the probabilistic frequency sketch. That feature cheaply gives the cache a "memory" of keys it has evicted, so if they show up again soon it knows it made the wrong decision and is biased towards keeping them (a sketch of the idea follows this list). Given the way keys change in RelStorage, simulations showed the feature wasn't particularly useful there; but since keys here are unchanging OIDs, it might be more useful in this cache. In any case, it can always be added later.
  • Garbage collection (incrgc) and tuning become simpler, as the cache always stays at its preferred size, automatically, based on the frequency and LRU-ness of the data. We know exactly which items are good to evict from the cache. The drain_resistance goes away.
  • Of course, this is complicated by the fact that the pickle cache can be limited by estimated byte size in addition to object count. Ideally one of those would go away so we could take full advantage of the existing code (and the combination makes it difficult to reason about cache behaviour anyway). My vote is for eliminating byte size at this level; let the lower levels worry about caching in terms of bytes.
  • Does anybody have some cache traces from an actual ZODB production instance, and info on where they came from? That could be plugged into simulations.
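
On the frequency-sketch point, the usual building block (e.g. in TinyLFU) is a count-min sketch of small saturating counters with periodic aging. Here is a minimal Python sketch of that idea; the table width, depth, hash seeds, and aging threshold are illustrative assumptions, not anything from the RelStorage code.

```python
import random


class FrequencySketch:
    """Sketch: count-min style frequency estimator with aging."""

    def __init__(self, width=4096, depth=4, max_count=15):
        self.width = width
        self.depth = depth
        self.max_count = max_count      # saturate like a 4-bit counter
        self.tables = [[0] * width for _ in range(depth)]
        self.seeds = [random.randrange(1 << 31) for _ in range(depth)]
        self.additions = 0

    def _index(self, key, row):
        return (hash(key) ^ self.seeds[row]) % self.width

    def increment(self, key):
        for row in range(self.depth):
            i = self._index(key, row)
            if self.tables[row][i] < self.max_count:
                self.tables[row][i] += 1
        self.additions += 1
        # Periodic aging: halve every counter so stale popularity decays.
        if self.additions >= 10 * self.width:
            self.additions = 0
            for table in self.tables:
                for i in range(self.width):
                    table[i] >>= 1

    def estimate(self, key):
        return min(self.tables[row][self._index(key, row)]
                   for row in range(self.depth))
```

Because the counters are indexed by hash rather than by cache residency, an OID that was evicted recently still carries a non-zero estimate; if it comes back soon it wins the admission comparison, which is the "memory" of evicted keys described above. The periodic halving keeps stale popularity from dominating forever.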
