ATS uses a tornado cache: the write pointer simply advances to the next slot no matter what, so the disk cache doesn't work as an LRU the way other cache servers do.
The benefit is that writing is fast and constant-time, since you don't have to do an LRU lookup to pick a place to store the object. The downside is that you create some cache misses unnecessarily.
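The "write pointer just moves forward" idea can be sketched as a circular buffer. This is a hypothetical toy model, not ATS's actual on-disk layout (ATS works with raw disk stripes, not Python objects); it only illustrates why writes are O(1) and why eviction is a side effect of the pointer advancing:

```python
class CyclicCache:
    """Toy model of an always-advancing write pointer over fixed slots."""

    def __init__(self, slots):
        self.slots = [None] * slots   # fixed-size ring of (key, value) pairs
        self.index = {}               # key -> slot position, for lookups
        self.write_ptr = 0            # only ever moves forward (and wraps)

    def put(self, key, value):
        # Whatever occupies the current slot is evicted unconditionally:
        # no LRU bookkeeping, no victim search -- constant-time writes.
        evicted = self.slots[self.write_ptr]
        if evicted is not None:
            self.index.pop(evicted[0], None)
        self.slots[self.write_ptr] = (key, value)
        self.index[key] = self.write_ptr
        self.write_ptr = (self.write_ptr + 1) % len(self.slots)

    def get(self, key):
        pos = self.index.get(key)
        return self.slots[pos][1] if pos is not None else None
```

Note the eviction victim is whoever happens to sit under the pointer, even if it was read a moment ago; that's the "unnecessary misses" trade-off.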
It's never really been a problem for me in practice. If it gives you a lot of heartache, I'd suggest putting a second cache tier in place; it's very unlikely you'll strike out on both tiers.
> The downside is that you are creating cache misses unnecessarily.
Statistically, it balances out just fine. It turns out that just by controlling how objects get into the cache, you can affect cache policy enough that eviction policies don't much matter, or at least, a "random out" isn't much different from an "LRU".
To avoid unnecessary cache writes, there's also a plugin that implements a rudimentary LRU. Basically, an object has to see some amount of traffic before it's allowed to be written to the cache. This is typically done in a scenario where it's OK to hit the parent caches, or origins, once or a few times extra. It can also be a very useful way to avoid excessive disk-write load on SSD drives (which can be sensitive to write wear, of course). See
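The "must see some traffic first" gate can be sketched in a few lines. This is a minimal illustration, not the plugin's actual implementation; the threshold of 2 is an arbitrary assumption:

```python
from collections import Counter

class AdmissionFilter:
    """Only admit an object to the cache after it has been requested
    `threshold` times, so one-hit wonders never trigger a disk write.
    Requests before admission fall through to the parent or origin."""

    def __init__(self, threshold=2):
        self.threshold = threshold
        self.seen = Counter()   # request counts per key

    def should_admit(self, key):
        self.seen[key] += 1
        return self.seen[key] >= self.threshold
```

A real implementation would bound the counter table (e.g. with a sketch or periodic decay) so it doesn't grow without limit.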
I believe the term you're looking for is "cache admission policy." It's an adjunct to cache eviction; both are needed for success. I'm very curious what a highly efficient admission policy paired with a trivial "eviction" policy (FIFO) would look like in practice.
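One way to picture that pairing: count-based admission in front of a plain FIFO. This is a speculative sketch of the idea being wondered about, not a known production design; capacity and the admission threshold are made-up parameters:

```python
from collections import Counter, deque

class FifoCacheWithAdmission:
    """Trivial FIFO eviction gated by a request-count admission policy."""

    def __init__(self, capacity, admit_after=2):
        self.capacity = capacity
        self.admit_after = admit_after
        self.order = deque()       # insertion order -> FIFO victim choice
        self.data = {}
        self.requests = Counter()  # how often each key has been stored

    def get(self, key):
        return self.data.get(key)

    def put(self, key, value):
        self.requests[key] += 1
        if key in self.data:
            self.data[key] = value          # refresh in place, keep position
            return
        if self.requests[key] < self.admit_after:
            return                          # not popular enough to cache yet
        if len(self.data) >= self.capacity:
            victim = self.order.popleft()   # oldest insert, regardless of use
            del self.data[victim]
        self.order.append(key)
        self.data[key] = value
```

The intuition is that if admission keeps low-value objects out, the victim FIFO picks matters much less, since everything in the cache has already proven some worth.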
PS: If anyone is interested in these problems, We're Hiring.
edit: https://aws.amazon.com/careers/ or preferably drop me a line to my profile email or my username "at amazon.com" for a totally informal chat (I'm an IC, not a manager, recruiter, or sales)
Yeah, with SSDs I wonder how much that really helps to improve performance vs. just no cache. Most SSDs have a lot of caching implemented internally, so a disk cache can often be self-defeating.
"It Depends." If youre doing "random" writes down to the block dev, like updating a filesystem, it can be very bad. You'll end up hitting the read/update/write cell issues and block other concurrent access. In general I'd worry (expect total throughput to go down, and tail latency way up) around a 10-20% write:read ratio. Conversely if youre doing sane sequential writes, say log structure merges with a 64-256KB chunk size, Id expect much less impact to your read latencies.