ATS uses a tornado cache: the write pointer simply advances to the next slot no matter what, so the disk cache doesn't work as an LRU the way other cache servers do.
The benefit is that writing is fast and constant-time, since you don't have to do an LRU lookup to pick a place to store the object. The downside is that you create some cache misses unnecessarily.
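The "write pointer just moves forward" idea can be sketched as a circular buffer. This is a hypothetical toy model, not ATS's actual on-disk layout (ATS works with raw disk stripes, not Python objects); it only illustrates why writes are O(1) and why eviction is a side effect of the pointer advancing:

```python
class CyclicCache:
    """Toy model of an always-advancing write pointer over fixed slots."""

    def __init__(self, slots):
        self.slots = [None] * slots   # fixed-size ring of (key, value) pairs
        self.index = {}               # key -> slot position, for lookups
        self.write_ptr = 0            # only ever moves forward (and wraps)

    def put(self, key, value):
        # Whatever occupies the current slot is evicted unconditionally:
        # no LRU bookkeeping, no victim search -- constant-time writes.
        evicted = self.slots[self.write_ptr]
        if evicted is not None:
            self.index.pop(evicted[0], None)
        self.slots[self.write_ptr] = (key, value)
        self.index[key] = self.write_ptr
        self.write_ptr = (self.write_ptr + 1) % len(self.slots)

    def get(self, key):
        pos = self.index.get(key)
        return self.slots[pos][1] if pos is not None else None
```

Note the eviction victim is whoever happens to sit under the pointer, even if it was read a moment ago; that's the "unnecessary misses" trade-off.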
It's never really been a problem for me in practice. If it gives you a lot of heartache, I'd suggest putting a second cache tier in place; it's very unlikely you'll strike out on both tiers.
> The downside is that you are creating cache misses unnecessarily.
Statistically, it balances out just fine. It turns out that just by controlling how objects get into the cache, you can affect cache policy enough that eviction policies don't much matter, or at least, a "random out" isn't much different from an "LRU".
To avoid unnecessary cache writes, there's also a plugin that implements a rudimentary LRU. Basically, an object has to see some amount of traffic before it's allowed to be written to the cache. This is typically done in a scenario where it's OK to hit the parent caches, or origins, once or a few times extra. It can also be a very useful way to avoid excessive disk-write load on SSD drives (which can be sensitive to write wear, of course). See
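The "must see some traffic first" gate can be sketched in a few lines. This is a minimal illustration, not the plugin's actual implementation; the threshold of 2 is an arbitrary assumption:

```python
from collections import Counter

class AdmissionFilter:
    """Only admit an object to the cache after it has been requested
    `threshold` times, so one-hit wonders never trigger a disk write.
    Requests before admission fall through to the parent or origin."""

    def __init__(self, threshold=2):
        self.threshold = threshold
        self.seen = Counter()   # request counts per key

    def should_admit(self, key):
        self.seen[key] += 1
        return self.seen[key] >= self.threshold
```

A real implementation would bound the counter table (e.g. with a sketch or periodic decay) so it doesn't grow without limit.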
I believe the term you're looking for is "cache admission policy." It's an adjunct to cache eviction; both are needed for success. I'm very curious what a highly efficient admission policy paired with a trivial "eviction" policy (FIFO) would look like in practice.
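One way to picture that pairing: count-based admission in front of a plain FIFO. This is a speculative sketch of the idea being wondered about, not a known production design; capacity and the admission threshold are made-up parameters:

```python
from collections import Counter, deque

class FifoCacheWithAdmission:
    """Trivial FIFO eviction gated by a request-count admission policy."""

    def __init__(self, capacity, admit_after=2):
        self.capacity = capacity
        self.admit_after = admit_after
        self.order = deque()       # insertion order -> FIFO victim choice
        self.data = {}
        self.requests = Counter()  # how often each key has been stored

    def get(self, key):
        return self.data.get(key)

    def put(self, key, value):
        self.requests[key] += 1
        if key in self.data:
            self.data[key] = value          # refresh in place, keep position
            return
        if self.requests[key] < self.admit_after:
            return                          # not popular enough to cache yet
        if len(self.data) >= self.capacity:
            victim = self.order.popleft()   # oldest insert, regardless of use
            del self.data[victim]
        self.order.append(key)
        self.data[key] = value
```

The intuition is that if admission keeps low-value objects out, the victim FIFO picks matters much less, since everything in the cache has already proven some worth.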
PS: If anyone is interested in these problems, We're Hiring.
edit: https://aws.amazon.com/careers/ or preferably drop me a line to my profile email or my username "at amazon.com" for a totally informal chat (I'm an IC, not a manager, recruiter, or sales)
Yeah, with SSDs I wonder how much that really helps to improve performance vs. just no cache. Most SSDs have a lot of caching implemented internally, so a disk cache can often be self-defeating.
"It Depends." If youre doing "random" writes down to the block dev, like updating a filesystem, it can be very bad. You'll end up hitting the read/update/write cell issues and block other concurrent access. In general I'd worry (expect total throughput to go down, and tail latency way up) around a 10-20% write:read ratio. Conversely if youre doing sane sequential writes, say log structure merges with a 64-256KB chunk size, Id expect much less impact to your read latencies.