I read it; I'll try to summarize the idea with some intuition.
A classic LRU cache has a map from key to value, with the entries of the map arranged in a doubly linked list. When a value is accessed, it is promoted to the front of the list. So the last item in the list is least-recently accessed.
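For intuition, here is a minimal LRU sketch in Python (my own toy code, not from the article). OrderedDict is itself a hash map threaded onto a doubly linked list, so move_to_end plays the role of the promote-to-front step and popitem(last=False) evicts from the cold end:

    from collections import OrderedDict

    class LRUCache:
        def __init__(self, capacity):
            self.capacity = capacity
            self.data = OrderedDict()   # oldest entry at the front, most recent at the back

        def get(self, key):
            if key not in self.data:
                return None
            self.data.move_to_end(key)  # promote: now the most recently used
            return self.data[key]

        def put(self, key, value):
            if key in self.data:
                self.data.move_to_end(key)
            self.data[key] = value
            if len(self.data) > self.capacity:
                self.data.popitem(last=False)  # evict the least recently used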
In an LFU cache, you might try the same idea, except each node also has a counter. When an item is accessed, increment its counter, then swap it forward past items whose counters are now lower, to keep the list sorted. In this way an infrequently accessed item does not get promoted.
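A toy version of that promotion step, with a plain Python list standing in for the sorted linked list (my code, and it assumes the key is present):

    # entries is kept sorted by count, highest first; each entry is a dict.
    def access(entries, key):
        i = next(j for j, e in enumerate(entries) if e["key"] == key)
        entries[i]["count"] += 1
        # Bubble the entry forward past anything with a lower count.
        while i > 0 and entries[i - 1]["count"] < entries[i]["count"]:
            entries[i - 1], entries[i] = entries[i], entries[i - 1]
            i -= 1
        return entries[i]["value"]

When every count is equal, the bumped entry now out-counts everything ahead of it, so the loop walks the entire list; that is the O(N) case described next.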
However, this has a problem for certain access patterns. If every item has the same counter value, then accessing the last item in the list forces it to migrate up the whole list, which is O(N). You could use a heap instead and get O(log N) time, but that's still not the constant time that LRU provides.
What's proposed here is to bucket items by counter. Each bucket has a linked list of entries, and each entry knows its bucket. The buckets themselves are arranged in a linked list. When an item in bucket N is accessed, it conceptually unhooks itself from bucket N and adds itself to bucket N+1, creating that bucket if necessary.
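Here is a rough sketch of that scheme in Python (my code, hypothetical names, simplified). One difference from the description above: instead of linking the buckets into a list, a dict keyed by count plays that role, and an OrderedDict per bucket keeps per-count recency for tie-breaking on eviction:

    from collections import OrderedDict, defaultdict

    class LFUCache:
        def __init__(self, capacity):
            self.capacity = capacity
            self.values = {}                          # key -> value
            self.counts = {}                          # key -> access count
            self.buckets = defaultdict(OrderedDict)   # count -> keys in that bucket
            self.min_count = 0                        # smallest count currently in use

        def _bump(self, key):
            # "Unhook from the N bucket, add to the N+1 bucket."
            c = self.counts[key]
            del self.buckets[c][key]
            if not self.buckets[c]:
                del self.buckets[c]
                if self.min_count == c:
                    self.min_count = c + 1
            self.counts[key] = c + 1
            self.buckets[c + 1][key] = None

        def get(self, key):
            if key not in self.values:
                return None
            self._bump(key)
            return self.values[key]

        def put(self, key, value):
            if self.capacity <= 0:
                return
            if key in self.values:
                self.values[key] = value
                self._bump(key)
                return
            if len(self.values) >= self.capacity:
                # Evict the least recently used key from the lowest-count bucket.
                victim, _ = self.buckets[self.min_count].popitem(last=False)
                if not self.buckets[self.min_count]:
                    del self.buckets[self.min_count]
                del self.values[victim]
                del self.counts[victim]
            self.values[key] = value
            self.counts[key] = 1
            self.buckets[1][key] = None
            self.min_count = 1

Every operation is a constant number of dict lookups and moves, which is the whole point of the construction.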
Incidentally, a tip for writing this sort of cache: make your list circular, starting with a sentinel node (I call it the "mouth" since it eats its tail). This eliminates all the null checks in the linked list manipulation.
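Roughly, the sentinel trick looks like this (toy code): the "mouth" node's prev/next start out pointing at itself, so an empty list is just the sentinel eating its tail, and insert/unlink never have to check for None:

    class Node:
        __slots__ = ("key", "value", "prev", "next")
        def __init__(self, key=None, value=None):
            self.key, self.value = key, value
            self.prev = self.next = self    # circular: a lone node points at itself

    def insert_after(anchor, node):
        # Works the same whether the list is empty or not; no null checks.
        node.prev, node.next = anchor, anchor.next
        anchor.next.prev = node
        anchor.next = node

    def unlink(node):
        node.prev.next = node.next
        node.next.prev = node.prev
        node.prev = node.next = node        # leave the removed node self-linked

    mouth = Node()                          # sentinel; the list is empty when mouth.next is mouth
    insert_after(mouth, Node("a", 1))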
I suspect we're going to see more of this sort of 'discovery' at this point. We have done a lot of the fundamental research but haven't puzzled out all of the implications or applications yet. Your description is sort of reminiscent of a recipe. All the ingredients have been there for a long time, but the mix is important.
You may know what eggs and wheat and milk and butter are, but if you've never seen a pancake before then boy have I got some good news for you. We can talk about croissants later once the buzz wears off.
Not to rain on anyone's parade, but this looks like a bog-standard interview-style question. It wouldn't be unusual to ask a candidate to "invent" and implement both data structures in a 45-minute session.
Not saying this sort of thing makes for a good interview, just that it's a cute little problem that has probably been solved a thousand times by a thousand different people.
Is there a way to make these data structures hardware-friendly, so that SIMD instructions can be used, memory is traversed contiguously without multiple pointer indirections, etc.?
For chaining hash tables like this you can use a linked list of buckets where each bucket holds multiple entries, and the entries within a bucket can be scanned with SIMD operations. The downside is that it considerably complicates the insertion/update logic, so depending on the workload it may be a net performance loss. If the hash table is designed for concurrency, the extra complexity can easily lead to hard-to-understand bugs or performance cliffs.
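A toy illustration of the lookup side only (not a full table; numpy's element-wise compare is just a stand-in for what a real implementation would do with SIMD intrinsics over the bucket's contiguous slot array):

    import numpy as np

    SLOTS = 8    # entries per bucket; 8 x 64-bit hashes = one cache line

    class Bucket:
        def __init__(self):
            self.hashes = np.zeros(SLOTS, dtype=np.uint64)   # contiguous, scannable slots
            self.entries = [None] * SLOTS                     # (key, value) pairs
            self.used = 0
            self.next = None                                  # overflow bucket, rarely followed

    def lookup(bucket, key, h):
        # h is the key's 64-bit hash; walk the (usually short) chain of buckets.
        while bucket is not None:
            # One vectorized comparison across all used slots in the bucket.
            for i in np.nonzero(bucket.hashes[:bucket.used] == h)[0]:
                k, v = bucket.entries[i]
                if k == key:          # confirm the match on a real hash collision
                    return v
            bucket = bucket.next
        return None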